Presented by Rachel Tillay and Mike Waugh, LSU Libraries
Is your data running loose in your library? OpenRefine is a tool that can help libraries more easily view, analyze, clean, and match large data sets. It is particularly useful for digital projects, statistics, or user data. This presentation will describe how OpenRefine is different from spreadsheets, datasheets, and programming. It will also include demonstrations of some of the most useful functions in OpenRefine, compatible tools, and solutions it provides to would-be data wranglers. Examples will include real-life problems that LSU Libraries has encountered in its cataloging and digital projects.
http://serai.utsc.utoronto.ca/rrsi2014
"Unlike traditional academic conferences, the Roots & Routes Summer Institute features a combination of informal presentations, seminar-style discussions of shared materials, hands-on workshops on a variety of digital tools, and small-group project development sessions. The institute welcomes participants from a range of disciplines with an interest in engaging with digital scholarship; technical experience is not a requirement. Graduate students (MA and PhD), postdoctoral fellows and faculty are all encouraged to apply."
Presented by Jennifer Hecker and Elizabeth Grumbach and hosted by the Texas Consortium on Digital Humanities, these are the slides for the TXDHC training webcast on OpenRefine, February 12th, 2015.
Central Pennsylvania Open Source Conference, October 17, 2015
Data is a hot topic in the tech sector with big data, data processing, data science, linked open data and data visualization to name only a few examples. Before data can be processed or analyzed it often has to be cleaned. OpenRefine is an open source interactive data transformation tool for working with messy data. This presentation will begin with a short overview of the features of OpenRefine. To demonstrate basic concepts of data cleaning, manipulating, faceting and filtering with OpenRefine, Pennsylvania Heritage magazine subject index data will be used as a case study.
Presented by Rachel Tillay and Mike Waugh, LSU Libraries
Is your data running loose in your library? OpenRefine is a tool that can help libraries more easily view, analyze, clean, and match large data sets. It is particularly useful for digital projects, statistics, or user data. This presentation will describe how OpenRefine is different from spreadsheets, datasheets, and programming. It will also include demonstrations of some of the most useful functions in OpenRefine, compatible tools, and solutions it provides to would-be data wranglers. Examples will include real-life problems that LSU Libraries has encountered in its cataloging and digital projects.
http://serai.utsc.utoronto.ca/rrsi2014
"Unlike traditional academic conferences, the Roots & Routes Summer Institute features a combination of informal presentations, seminar-style discussions of shared materials, hands-on workshops on a variety of digital tools, and small-group project development sessions. The institute welcomes participants from a range of disciplines with an interest in engaging with digital scholarship; technical experience is not a requirement. Graduate students (MA and PhD), postdoctoral fellows and faculty are all encouraged to apply."
Presented by Jennifer Hecker and Elizabeth Grumbach and hosted by the Texas Consortium on Digital Humanities, these are the slides for the TXDHC training webcast on OpenRefine, February 12th, 2015.
Central Pennsylvania Open Source Conference, October 17, 2015
Data is a hot topic in the tech sector with big data, data processing, data science, linked open data and data visualization to name only a few examples. Before data can be processed or analyzed it often has to be cleaned. OpenRefine is an open source interactive data transformation tool for working with messy data. This presentation will begin with a short overview of the features of OpenRefine. To demonstrate basic concepts of data cleaning, manipulating, faceting and filtering with OpenRefine, Pennsylvania Heritage magazine subject index data will be used as a case study.
Invited talk at USEWOD2014 (http://people.cs.kuleuven.be/~bettina.berendt/USEWOD2014/)
A tremendous amount of machine-interpretable information is available in the Linked Open Data Cloud. Unfortunately, much of this data remains underused as machine clients struggle to use the Web. I believe this can be solved by giving machines interfaces similar to those we offer humans, instead of separate interfaces such as SPARQL endpoints. In this talk, I'll discuss the Linked Data Fragments vision on machine access to the Web of Data, and indicate how this impacts usage analysis of the LOD Cloud. We all can learn a lot from how humans access the Web, and those strategies can be applied to querying and analysis. In particular, we have to focus first on solving those use cases that humans can do easily, and only then consider tackling others.
A presentation on Application Architecture for Semantic Web Applications based on chapter 4 of the book Semantic Web for the Working Ontologist by Dean Allemang and Jim Hendler. It focusses on RDF parsing and serialising and RDF stores.
Presentation of the paper "Creating 3rd Generation Web APIs with Hydra" at the 22nd Internation World Wide Web Conference (WWW2013) in Rio de Janeiro, Brazil
ISWC 2014 - Dandelion: from raw data to dataGEMs for developersSpazioDati
This is the presentation showed during ISWC 2014 at Riva del Garda. The session was titled "Developers Workshop", and the focus was on how you solved practical problems for Linked Data. We presented dandelion platform and our data curation workflow, and the overall idea of dataGEM APIs.
Invited talk at USEWOD2014 (http://people.cs.kuleuven.be/~bettina.berendt/USEWOD2014/)
A tremendous amount of machine-interpretable information is available in the Linked Open Data Cloud. Unfortunately, much of this data remains underused as machine clients struggle to use the Web. I believe this can be solved by giving machines interfaces similar to those we offer humans, instead of separate interfaces such as SPARQL endpoints. In this talk, I'll discuss the Linked Data Fragments vision on machine access to the Web of Data, and indicate how this impacts usage analysis of the LOD Cloud. We all can learn a lot from how humans access the Web, and those strategies can be applied to querying and analysis. In particular, we have to focus first on solving those use cases that humans can do easily, and only then consider tackling others.
A presentation on Application Architecture for Semantic Web Applications based on chapter 4 of the book Semantic Web for the Working Ontologist by Dean Allemang and Jim Hendler. It focusses on RDF parsing and serialising and RDF stores.
Presentation of the paper "Creating 3rd Generation Web APIs with Hydra" at the 22nd Internation World Wide Web Conference (WWW2013) in Rio de Janeiro, Brazil
ISWC 2014 - Dandelion: from raw data to dataGEMs for developersSpazioDati
This is the presentation showed during ISWC 2014 at Riva del Garda. The session was titled "Developers Workshop", and the focus was on how you solved practical problems for Linked Data. We presented dandelion platform and our data curation workflow, and the overall idea of dataGEM APIs.
The Power of Semantic Technologies to Explore Linked Open DataOntotext
Atanas Kiryakov's, Ontotext’s CEO, presentation at the first edition of Graphorum (http://graphorum2017.dataversity.net/) – a new forum that taps into the growing interest in Graph Databases and Technologies. Graphorum is co-located with the Smart Data Conference, organized by the digital publishing platform Dataversity.
The presentation demonstrates the capabilities of Ontotext’s own approach to contributing to the discipline of more intelligent information gathering and analysis by:
- graphically explorinh the connectivity patterns in big datasets;
- building new links between identical entities residing in different data silos;
- getting insights of what type of queries can be run against various linked data sets;
- reliably filtering information based on relationships, e.g., between people and organizations, in the news;
- demonstrating the conversion of tabular data into RDF.
Learn more at http://ontotext.com/.
Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)Jeff Magnusson
Overview of the data platform as a service architecture at Netflix. We examine the tools and services built around the Netflix Hadoop platform that are designed to make access to big data at Netflix easy, efficient, and self-service for our users.
From the perspective of a user of the platform, we walk through how various services in the architecture can be used to build a recommendation engine. Sting, a tool for fast in memory aggregation and data visualization, and Lipstick, our workflow visualization and monitoring tool for Apache Pig, are discussed in depth. Lipstick is now part of Netflix OSS - clone it on github, or learn more from our techblog post: http://techblog.netflix.com/2013/06/introducing-lipstick-on-apache-pig.html.
In this webinar Thomas Cook, Sales Director, AnzoGraph DB, provides a history lesson on the origins of SPARQL, including its roots in the Semantic Web, and how linked open data is used to create Knowledge Graphs. Then, he dives into "What is RDF?", "What is a URI?" and "What is SPARQL?", wrapping up with a real-world demonstration via a Zeppelin notebook.
Donald Miner will do a quick introduction to Apache Hadoop, then discuss the different ways Python can be used to get the job done in Hadoop. This includes writing MapReduce jobs in Python in various different ways, interacting with HBase, writing custom behavior in Pig and Hive, interacting with the Hadoop Distributed File System, using Spark, and integration with other corners of the Hadoop ecosystem. The state of Python with Hadoop is far from stable, so we'll spend some honest time talking about the state of these open source projects and what's missing will also be discussed.
Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015Codemotion
Codemotion Rome 2015 - I Big Data sono indubbiamente tra i temi più "caldi" del panorama tecnologico attuale. Ad oggi nel mondo sono stati prodotti circa 5 Exabytes di dati che costituiscono una potenziale fonte di "intelligenza" che è possibile sfruttare, grazie alle tecnologie più recenti, in svariati ambiti che spaziano dalla medicina alla sociologia passando per il marketing. Il talk si propone, tramite una gita virtuale nello spazio, di introdurre i concetti, le tecniche e gli strumenti che consentono di iniziare a sfruttare il potenziale dei Big Data nel lavoro quotidiano.
The Business Case for Semantic Web Ontology & Knowledge GraphCambridge Semantics
In this webinar Mark Wallace, Ontologist & Developer, Semantic Arts, and Thomas Cook, Director of Sales AnzoGraph DB, Cambridge Semantics, explore the benefits of building a Semantic Knowledge Graph with RDF*, wrapping up with an airline data demo that illustrates the value of schema, inference and reasoning in it.
Why MongoDB over other Databases - HabilelabsHabilelabs
MongoDB is the faster-growing database. It is an open-source document and leading NoSQL database with the scalability and flexibility that you want with the querying and indexing that you need. In this Document, I presented why to choose MongoDB is over another database.
Big Data, a space adventure - Mario Cartia - Codemotion Milan 2014Codemotion
I Big Data sono indubbiamente tra i temi più "caldi" del panorama tecnologico attuale. Ad oggi nel mondo sono stati prodotti circa 5 Exabytes di dati che costituiscono una potenziale fonte di "intelligenza" che è possibile sfruttare, grazie alle tecnologie più recenti, in svariati ambiti che spaziano dalla medicina alla sociologia passando per il marketing. Il talk si propone, tramite una gita virtuale nello spazio, di introdurre i concetti, le tecniche e gli strumenti che consentono di iniziare a sfruttare il potenziale dei Big Data nel lavoro quotidiano.
Here is our most popular Hadoop Interview Questions and Answers from our Hadoop Developer Interview Guide. Hadoop Developer Interview Guide has over 100 REAL Hadoop Developer Interview Questions with detailed answers and illustrations asked in REAL interviews. The Hadoop Interview Questions listed in the guide are not "might be" asked interview question, they were asked in interviews at least once.
In this talk I will show Visualbox, a "visualization server" based on LODSPeaKr that can make easy for non javascript experts to create simple but meaningful visualizations.
Introduction to Hadoop.
What are Hadoop, MapReeduce, and Hadoop Distributed File System.
Who uses Hadoop?
How to run Hadoop?
What are Pig, Hive, Mahout?
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfPeter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
OMG! My metadata is as fresh as the Backstreet Boys: How Google Refine can update, clean up and link your metadata to the wider world
1. OMG! MY METADATA IS AS
FRESH AS THE BACKSTREET
BOYS: HOW GOOGLE REFINE
CAN UPDATE, CLEAN UP AND
LINK YOUR METADATA TO THE
WIDER WORLD
SARAH BETH WEEKS
LIBRARY TECHNOLOGY CONFERENCE 2013
WEEKSS@STOLAF.EDU
@RASCALWHALE
2. SAMPLE PROJECT: NORDIC AMERICAN
IMPRINTS
Situation: Wanted to match publishers of our books against a
list of important Nordic American Publishers (compiled by Penny
Huf fman) to find materials for our special collections.
Problem: Hard to compare when publication info is not
controlled:
3. ANSWER: GOOGLE REFINE!
Google Refine can “match and
merge” messy data filled with:
Random, leading or trailing spaces
stray punctuation
typos
odd capitalization
and more!
24. END RESULT?
Using Google Refine we were able to reduce the
3230 unique values for city (260|a) to just 1153. For
publishers (260|b) we went from 11342 unique
names for publishers to approximately 6500.
This project helped to identify over 2,000 potential
candidates for our Nordic American Imprints
collection. (These are still being evaluated).
The controlled publishers, cities of publications and
dates will be added to a local 9xx field for faceting in
our future special collections discover tool. Users will
be able to browse our Nordic American Imprints
collection by publisher, city or state.
28. FOR THE REST CLICK THE OPTIONS TO
SEE WHAT EACH REPRESENTS
Then click “Match All Identical Cells” (or double checkmarks)
to link all cells with this text to this Freebase topic
29. OR “SEARCH FOR MATCH” TO BRING UP
AN AUTO-FILL LIST TO CHOOSE FROM
37. THANK YOU!
Questions?
Link to a public version of this presentation
at my (personal) blog:
gardenandalibrary.blogspot.com
I’m also happy to take questions by e-
mail
weekss@stolaf.edu