Presentation given at the 2018 BYU Conference on Family History and Genealogy. While record hinting has greatly increased the number of record sources attached to persons in FamilySearch Family Tree, many records are still available only as images and have not yet been indexed to make them searchable. This is especially true for non-English records. This presentation shows how FamilySearch is using artificial intelligence (AI) and other cutting-edge technologies to greatly increase the number of historical records available to patrons, with the goal of providing more findable, relevant, curated records for gathering multi-generational families from around the world.
2. Background
• Over 8½ years as a Software Engineer at FamilySearch
• Currently on the Automated Content Extraction team
• Try to do my own genealogy and help others
• Hope I’ll be able to help you see a vision of the future
• Go to https://www.slideshare.net/bakers84 or e-mail me (bakerb@familysearch.org) to get a copy of this presentation
• Click here for the related printed handout materials
3. First, Some Basics
Good News
• FamilySearch published its 2
billionth image in April 2018
• The 1 billionth image was
published in June 2014
• FamilySearch continues to
digitize nearly 1M images per
day from microfilm and about 320
cameras worldwide
• FamilySearch has nearly 6.4B indexed
names of people in records
• Record hinting has already made
FamilySearch Family Tree the
most well sourced tree in the
world with over 1B sources
attached to persons in the tree
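The two publication milestones above imply an average publication rate, which can be checked with a few lines of Python. This is only a rough sketch: the slide gives month and year, so the day of the month used here is an assumption, and the result is an average publication rate, not the digitization rate quoted on the slide.

```python
from datetime import date

# Milestones quoted on the slide. The slide gives only month and year,
# so the day of month (the 1st) is an ASSUMPTION made for illustration.
first_billion = date(2014, 6, 1)
second_billion = date(2018, 4, 1)

# One billion images were published between the two milestones.
days_between = (second_billion - first_billion).days
average_per_day = 1_000_000_000 / days_between

print(f"{days_between} days between milestones")
print(f"about {average_per_day:,.0f} images published per day on average")
```

Under that assumption the average comes out to roughly 0.7M published images per day over the period.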
Bad News
• Many records are only available
as images via the catalog. Only a
fraction of records have been
indexed
• Indexing isn’t keeping up with the
ability to digitize images,
especially in non-English
languages
• Current available record images
do not match church membership
in some areas
• Only indexed records can be
presented as record hints
4. Historical Records Images by Region at FamilySearch
[Pie chart; legend: North America, Europe and Middle East, Latin America, Other, Asia, Africa/Pacific]
LDS Church Membership by Region
[Pie chart; legend: North America, Europe and Middle East, Latin America, Other, Asia, Africa/Pacific]
5. Changing the Records Publication
Paradigm
• Several teams at FamilySearch are dedicated to improving the
records publication platform
• The Goal: Provide more findable, relevant, curated records for
gathering multi-generational families from around the world
• Want to publish and make hintable 20% of the top tier records
in 50 of the highest priority countries within 15 years
• 58% coverage in North America as of 2017
• Crossed 20% in 3 more countries in 2017 (Denmark, Finland and Sweden)
• Major release of Mexican records in 2018
• Seeking to allow homelands to be more involved in building
local content
• Will support user corrections to records and indexing on-the-fly
• Will use automated technologies to accelerate publication
7. First Mini-Explosion
• Partnership with GenealogyBank to extract
data from born digital obituaries
• First run indexed 5M obituaries in 10 hours,
saving about 150 man-years of indexing
• 23M obituaries indexed as of May 2018,
many more coming
• Uses recent advancements in machine
learning and artificial intelligence (AI)
• Can produce even more information than
indexing (e.g., in-law couple relationships)
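The figures for the first run can be turned into a throughput and an implied manual effort per obituary. The sketch below uses the slide's numbers; the 2,000 working hours per person-year is an assumption that does not come from the presentation, so the per-obituary figure is only indicative.

```python
# Figures quoted on the slide.
obituaries = 5_000_000
run_hours = 10
person_years_saved = 150

# ASSUMPTION (not from the presentation): one person-year of indexing
# is roughly 2,000 working hours.
HOURS_PER_PERSON_YEAR = 2_000

per_hour = obituaries / run_hours          # automated throughput
per_second = per_hour / 3600
manual_hours = person_years_saved * HOURS_PER_PERSON_YEAR
manual_minutes_each = manual_hours * 60 / obituaries

print(f"Automated: {per_hour:,.0f} obituaries/hour (~{per_second:.0f}/second)")
print(f"Implied manual effort: {manual_minutes_each:.1f} minutes per obituary")
```

With that assumption, the quoted saving corresponds to a few minutes of human indexing per obituary.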
10. What is Being Done Now
• Refining research code and models to be more
stable, reproducible and measurable
• Support ability to publish 1M obituaries a month
now, continuing to increase
• Built on scalable Amazon Web Services to meet
any future demands
11. How are Artificial Intelligence,
Machine Learning and Deep
Learning Related?
Artificial Intelligence – Machines exhibiting
human intelligence
• General AI – still science fiction
• Narrow AI – technologies that perform
specific tasks as well or better than humans
Machine Learning – Practice of using algorithms
to parse data, learn from it, and then make a
determination or prediction about something
in the world
Deep Learning – Using much larger machine
learning neural networks requiring more
training data and computational power
[Diagram: nested circles, with Deep Learning inside Machine Learning inside Artificial Intelligence]
12. Machine Learning Isn’t Really New
• Been around for decades
• Spam filters in 1990s
• OCR (Optical Character Recognition)
• FamilySearch already uses for some things
• Match classifier
• Possible duplicates (person – person)
• Record hinting (person – record)
• FamilySearch is beginning to explore new uses
• Research Team -> Automated Content Extraction
• Exploring Deep Learning and other methods to automatically
understand historical documents
13. How is Machine Learning different
from traditional programming?
Machine Learning means having computers learn from data,
instead of having people write rules (i.e. code), to solve
problems
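[Diagram, traditional programming: Study the Problem → Write Rules → Evaluate → Launch!, with Analyze Errors feeding back into the loop]
[Diagram, machine learning: Study the Problem → Train ML Algorithm (fed by Data) → Evaluate → Launch!, with Analyze Errors feeding back into the loop]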
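The contrast between the two workflows can be shown with a toy example. The sketch below is not FamilySearch code: the training sentences are invented for illustration, and the learner is a small naive Bayes word counter chosen only because it fits in a few lines. The hand-written rule is fixed by its author, while the learned model picks up whatever words the labeled data contains.

```python
import math
from collections import Counter

# Hypothetical labeled examples, invented for illustration only.
TRAINING = [
    ("passed away peacefully survived by his wife", "obituary"),
    ("funeral services will be held saturday", "obituary"),
    ("she is survived by three daughters", "obituary"),
    ("city council approves new road budget", "other"),
    ("local team wins saturday game", "other"),
    ("new store will open downtown", "other"),
]

def rule_based(text):
    """Traditional programming: a person writes the rule by hand."""
    return "obituary" if "survived by" in text or "funeral" in text else "other"

def train(examples):
    """Machine learning: count words per label instead of writing rules."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    return word_counts, label_counts

def predict(model, text):
    """Naive Bayes with add-one smoothing over the learned counts."""
    word_counts, label_counts = model
    vocab = {w for counts in word_counts.values() for w in counts}
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        total = sum(counts.values())
        score = math.log(label_counts[label] / sum(label_counts.values()))
        for word in text.split():
            score += math.log((counts[word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAINING)
print(rule_based("he passed away on tuesday"))      # no keyword matches the rule
print(predict(model, "he passed away on tuesday"))  # uses words seen in training
```

Improving the rule-based version means editing code; improving the learned version means adding or correcting labeled examples and retraining.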
14. Necessary Technologies
• Natural Language Processing (NLP)
• Named entity recognition (NER) – identify the names,
dates, places, etc.
• Relation extraction – identify relationships between the
names, dates & places
• Additional processing to get into format for
publication, standardize data, etc.
• Notice the steps are similar to what a
genealogist would do
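The three steps listed above (entity recognition, relation extraction, standardization) can be sketched on a single sentence. This is a deliberately simple, rule-based illustration using regular expressions, not the machine-learned models the presentation describes; the sample text, patterns, labels and function names are all hypothetical.

```python
import re
from datetime import datetime

# Hypothetical obituary text, invented for illustration only.
TEXT = ("John Smith, 84, of Provo, died March 3, 2018. "
        "He is survived by his wife, Mary Smith.")

# Toy patterns: two capitalized words for a name, "Month D, YYYY" for a date.
NAME = r"[A-Z][a-z]+ [A-Z][a-z]+"
DATE = (r"(?:January|February|March|April|May|June|July|August|"
        r"September|October|November|December) \d{1,2}, \d{4}")

def extract_entities(text):
    """Step 1 (NER): find the names, dates and places."""
    entities = [("DATE", m.group()) for m in re.finditer(DATE, text)]
    entities += [("NAME", m.group()) for m in re.finditer(NAME, text)]
    place = re.search(r"\bof ([A-Z][a-z]+)", text)
    if place:
        entities.append(("PLACE", place.group(1)))
    return entities

def extract_relations(text):
    """Step 2 (relation extraction): link the entities together."""
    relations = []
    death = re.search(rf"({NAME})[^.]*?\bdied ({DATE})", text)
    if not death:
        return relations
    deceased = death.group(1)
    relations.append((deceased, "died_on", death.group(2)))
    spouse = re.search(
        rf"survived by (?:his|her) (?:wife|husband), ({NAME})", text)
    if spouse:
        relations.append((deceased, "spouse_of", spouse.group(1)))
    return relations

def standardize_date(text_date):
    """Step 3 (standardization): convert a written date to ISO format."""
    return datetime.strptime(text_date, "%B %d, %Y").date().isoformat()

print(extract_entities(TEXT))
print(extract_relations(TEXT))
print(standardize_date("March 3, 2018"))
```

Hand-written patterns like these break quickly on real text, which is part of why learned models are attractive for this kind of extraction.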
17. Document Type | Record Type | Language | Status in May 2018
Digital text | Obituaries | English | Already published 23M; working to continuously publish
Typewritten newspaper text | Obituaries | English | Active research
Handwritten text | Wills and deeds | English | Active research
Handwritten calligraphy | Genealogies | Chinese | Preliminary research
Handwritten text | Church records | Spanish | Preliminary research
More document types | More record types | More languages | Expect future “explosions”
19. What You Can Do
• Keep Indexing
• It is still valuable, especially in non-English languages
• Remember indexed data is the foundation for training machines
to auto-index correctly
• We’ll also likely continue to use human indexing to continue to
measure how the machines are doing
• Understand your role in correcting records that
have been automatically indexed incorrectly
• Be patient as solutions continue to expand,
perhaps on collections that don’t benefit your
research, remembering we are a global church
• Pray for the Lord’s help to bless these efforts
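The point above about using human indexing to measure how the machines are doing can be made concrete with precision and recall. The sketch below is an assumption about how such a comparison might look, not a description of FamilySearch's actual metrics; the field names and values are hypothetical.

```python
def score(human_fields, machine_fields):
    """Compare machine-extracted fields against a human-indexed reference.

    precision: share of machine fields that match the human index
    recall:    share of human-indexed fields the machine also found
    """
    human, machine = set(human_fields), set(machine_fields)
    correct = human & machine
    precision = len(correct) / len(machine) if machine else 0.0
    recall = len(correct) / len(human) if human else 0.0
    return precision, recall

# Hypothetical record, invented for illustration only.
human = [
    ("name", "John Smith"),
    ("death_date", "2018-03-03"),
    ("spouse", "Mary Smith"),
    ("place", "Provo"),
]
machine = [
    ("name", "John Smith"),
    ("death_date", "2018-03-03"),
    ("spouse", "Mary Smyth"),  # misread surname, counts as an error
]

precision, recall = score(human, machine)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this example the machine finds two of the four human-indexed fields and gets one of its three fields wrong, which is the kind of error a user correction could fix.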
21. “We always overestimate the change that will
occur in the next two years and
underestimate the change that will occur in the
next ten.”
Bill Gates
22. Tale of Three Decades
1998-2007 – Laying the technological foundation
1996 – GEDCOM 5.5 standard released (still supported)
1999 – PAF 4.0 – First Windows version
2002 – PAF 5.2 – Last major version
2004 – First vault microfilms converted to digital images
2007 – First digital images from the vault published on FamilySearch.org
2008-2017 – Single publicly available tree integrated with historical records
2010 – Launch of FamilySearch record search (>1B names, millions of images)
2006 – FamilySearch indexing began
2007 – FamilySearch Research wiki started
2009 – new.familysearch became available in Utah (limited rollout began in 2007)
2009 – I began to work at FamilySearch
2011 – RootsTech conference began
2013 – Family Tree added – made available to non-LDS patrons
2013 – Memories (photos & stories) initial rollout
2014 – Partnerships with Ancestry, MyHeritage and FindMyPast
2014 – Record hinting
2014 – First FamilySearch mobile app released
2015 – User to User Messaging
2015 – Printing temple cards from home in 44 languages
2016 – Family Tree moved to scalable servers
2017 – Web indexing
2018 – Family Tree Lite
2018-2027 – Worldwide explosion of records
2017 – Nordic Records – Year of the Viking: Scandinavian countries (Sweden, Denmark, Finland) were the first 3 of the top 50 countries
2018 – Mexican Civil Records project – 60M records
???? – Billions more indexed records made available via automatic indexing technologies
???? – User corrections of records supported
???? – DNA Features?
???? – ????
23. Thank you!
I hope you’ve been inspired
Keep an eye out for more explosions