This document summarizes Gary Sieling's presentation on indexing historic videos in Solr. The presentation covers crawling videos from various sources, extracting metadata with tools like Cheerio.js, and solving challenges such as handling propaganda, detecting years, aligning captions, and suggesting related content. The goal is a searchable platform for historical videos that lets users explore past events much as Street View lets them explore places.
16. Solution: Default Ranking
• List of good publishers
• De-rank "bad" talks (too technical or heavy topics)
• Daily randomization (compare to Reddit)
• Embeddable video
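The ranking rules above can be sketched as a small scoring function. This is only an illustration under stated assumptions, not the presenter's implementation: the field names (`publisher`, `derank`, `embeddable`), the `GOOD_PUBLISHERS` set and the weights are all hypothetical, and "Embeddable video" is read here as a filter. The daily randomization is modeled as a hash of the date and document id, so the order is stable within a day and reshuffles the next day.

```python
import hashlib
from datetime import date

# Hypothetical allow-list; the real list of good publishers is not in the slides.
GOOD_PUBLISHERS = {"example-archive"}


def daily_jitter(doc_id, day):
    """Deterministic pseudo-random value in [0, 1) that changes once per day."""
    digest = hashlib.sha256(f"{day.isoformat()}:{doc_id}".encode()).hexdigest()
    return int(digest[:8], 16) / 0x100000000


def default_score(doc, day):
    """Jitter plus a boost for good publishers and a penalty for de-ranked talks."""
    score = daily_jitter(doc["id"], day)
    if doc.get("publisher") in GOOD_PUBLISHERS:
        score += 1.0  # assumed weight: always outranks the jitter alone
    if doc.get("derank"):
        score -= 1.0  # assumed weight: pushes the talk below neutral ones
    return score


def default_ranking(docs, day):
    """Keep embeddable videos only (an assumption), best score first."""
    playable = [d for d in docs if d.get("embeddable")]
    return sorted(playable, key=lambda d: default_score(d, day), reverse=True)
```

Because the jitter stays below 1.0 and the boost and penalty are both 1.0, the randomization only reorders documents within the same tier (good, neutral, de-ranked) and never across tiers.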
26. Problem: Missing Captions
Smith-Waterman Alignment
00:40:32 push again and I'll say to them read my 00:40:37 lips - - -
- push again, and I'll say to them, "Read my - lips: no new taxes."
00:40:32 push again, and I'll say to them, "Read my 00:40:37 lips: no new taxes."
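The alignment shown above can be reproduced with a word-level Smith-Waterman pass. The following is a minimal sketch, not the presenter's code: the scoring values (match 2, mismatch -1, gap -1), the `normalize` rule and the helper name `retime` are assumptions chosen for illustration. Timed caption words are aligned against the punctuated transcript, and each timestamp is then moved onto the transcript word that its cue's first word aligned with.

```python
import re


def normalize(word):
    """Lowercase a token and drop punctuation so '"Read' matches 'read'."""
    return re.sub(r"[^a-z0-9']", "", word.lower())


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (i, j) index pairs on the best local alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the best cell until the score drops to zero.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1  # caption word with no transcript counterpart
        else:
            j -= 1  # transcript word with no caption counterpart
    pairs.reverse()
    return pairs


def retime(cues, transcript):
    """cues: list of (timestamp, text). Returns cues carrying transcript text."""
    cue_words, cue_of_word = [], []
    for idx, (_, text) in enumerate(cues):
        for w in text.split():
            cue_words.append(normalize(w))
            cue_of_word.append(idx)
    words = transcript.split()
    pairs = smith_waterman(cue_words, [normalize(w) for w in words])
    # First transcript word aligned to each cue marks where that cue starts.
    start = {}
    for ci, ti in pairs:
        start.setdefault(cue_of_word[ci], ti)
    ordered = sorted(start.items())
    out = []
    for n, (cue_idx, t_start) in enumerate(ordered):
        t_end = ordered[n + 1][1] if n + 1 < len(ordered) else len(words)
        out.append((cues[cue_idx][0], " ".join(words[t_start:t_end])))
    return out
```

Called with the two cues from the slide (`00:40:32` and `00:40:37`) and the punctuated transcript, `retime` attaches the trailing words missing from the captions to the last cue, which is the behavior shown in the third row of the example.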
39. Outcome: Speakers + "Martin Luther King, Jr."
• Martin Luther King III
• Martin Luther
• Ralph Abernathy
• James Bevel
• Malcolm X
• Fred Shuttlesworth
• Stokely Carmichael
40. Outcome: Topics + "Vietnam War"
• South Vietnam
• North Vietnam
• Military history of Australia during the Vietnam War
• Laotian Civil War
• Republic of Vietnam
• First Indochina War
• Cold War
41. Storing Recommendations
Solr Core: Suggestions
Term: Ron Chernow
Suggestions: "['Joseph Ellis', 'Gordon S. Wood', 'Kevin Phillips', 'David Brion Davis', 'David Levering Lewis', 'Paul Collier', 'James B. Stewart', 'Ron Rosenbaum', 'Jean Edward Smith', 'Gordon A. Craig']"
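The stored value above looks like a Python list serialized into a single string field. A hedged sketch of writing and reading such a document follows; the field names `id`, `term` and `suggestions` and both helper functions are assumptions for illustration, since the slide does not show the core's schema.

```python
import ast
import json


def to_solr_doc(term, suggestions):
    """Build one document for a hypothetical 'suggestions' core."""
    return {
        "id": term,  # assumed: the term doubles as the unique key
        "term": term,
        "suggestions": repr(list(suggestions)),  # list stored as one string
    }


def read_suggestions(doc):
    """Parse the stored string back into a list without using eval()."""
    return ast.literal_eval(doc["suggestions"])


doc = to_solr_doc("Ron Chernow", ["Joseph Ellis", "Gordon S. Wood"])
payload = json.dumps([doc])  # JSON array of documents, ready to send to Solr
```

Storing the list as a string keeps the schema to two plain fields at the cost of a parsing step on read; a multi-valued field would be an alternative design.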