Demystifying Digital Humanities: Winter 2014 session #1 (Paige Morgan)
Slides from the January 18th Demystifying Digital Humanities workshop on Exploring Programming in the Humanities, held at the Simpson Center for the Humanities, and taught by Paige Morgan, Sarah Kremen-Hicks, and Brian Gutierrez
This document summarizes an introductory session on programming in the digital humanities. It discusses how programming involves complex work in figuring out what to do and which languages to use. Examples are provided of tasks a programming language could perform on text data, like finding quotes from a novel or allowing a user to search a text file. The document emphasizes that critical thinking is important to programming in the humanities. It also discusses different ways of structuring data, such as with markup languages like HTML and TEI, or in a structured format like a database. The goal is to make data understandable to computers while retaining its usefulness. Collaboration is important when creating structured data.
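As a minimal sketch of the kind of text task the session describes (finding quotes in a novel), the following pure-Python snippet pulls double-quoted passages out of a string; the sample sentence is illustrative, not drawn from the workshop materials:

```python
import re

def find_quotes(text):
    """Return all double-quoted passages in a string of prose."""
    # Matches text between straight or curly double quotes.
    return re.findall(r'[“"]([^”"]+)[”"]', text)

sample = 'She said, "Call me Ishmael," and closed the book.'
print(find_quotes(sample))  # ['Call me Ishmael,']
```

A task like this is deliberately small: the workshop's point is that deciding what counts as a "quote" (dialogue? epigraphs? nested quotation?) is the humanities-critical part, before any code is written.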
This document summarizes a workshop on programming with data. It discusses preparing data by structuring it into categories and relationships. Examples are given of literary mapping projects that encode spatial and prosodic data from texts. Programming tasks for these projects are outlined, such as counting feet in poems and identifying deviation patterns. The document emphasizes thinking through one's data and goals before choosing tools or languages in order to focus learning.
An overview of the area, the current potential for open technologies, and some suggestions as to why they are not as heavily used as they should be.
Welcome to the Brixton Library Technology Initiative (Basil Bibi)
This document introduces a Python coding initiative at the Brixton Library for adults. It provides information about meeting times and contacts, as well as a detailed overview of the Python programming language, its history and uses. Participants are encouraged to register for an associated free online Coursera course and attend Saturday sessions at the library for assistance and collaboration.
folksonomy, social tagging, tag clouds, automatic folksonomy construction, word clouds, wordle, context-preserving word cloud visualisation, CPEWCV, seam carving, inflate and push, star forest, cycle cover, quantitative metrics, realized adjacencies, distortion, area utilization, compactness, aspect ratio, running time, semantics in language technology
Grammarly AI-NLP Club #2 - Recent advances in applied chatbot technology - Jo... (Grammarly)
Speaker: Jordi Carrera Ventura, Artificial Intelligence technologist at Telefónica R&D
Summary: Chatbots (aka conversational agents, spoken dialogue systems) allow users to interface with computers using natural language by simply asking questions or issuing commands.
Given a query, the chatbot builds a semantic representation of the input, transforms it into a logical statement, and performs all the necessary actions to fulfill the user's intent. Sometimes this simply means calculating an exact answer or retrieving a fact from a database, whereas other times it means building a contextual model and running a full-fledged conversation flow while keeping track of anaphoras and cross-references.
Besides the direct applications of chatbots in IoT (Amazon’s Alexa, Apple's Siri) and IT (the historical field of Information Retrieval as a whole can be seen as a sub-problem of spoken dialogue systems), chatbots' main appeal for technologists is their location at the intersection of all major Natural Language Processing technologies and many of the deepest questions in Cognitive Science today: semantic parsing, entity recognition, knowledge representation, and coreference resolution.
In this talk, I will explore those questions in the context of an applied industry setting, and I will introduce a framework suitable for addressing them, together with an overview of the state-of-the-art in chatbot technology and some original techniques.
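The pipeline sketched above (query, then a semantic representation, then an action that fulfills the intent) can be illustrated with a toy rule-based agent. Everything here is a simplification I am supplying for illustration: real systems use statistical semantic parsing, not keyword rules, and `FACTS` stands in for a database lookup:

```python
def parse_intent(query):
    """Very rough 'semantic representation' step: map a query
    to an intent label plus any extracted slot values."""
    q = query.lower()
    if q.startswith(("what is", "who is")):
        return {"intent": "lookup", "topic": q.split("is", 1)[1].strip(" ?")}
    if "weather" in q:
        return {"intent": "weather"}
    return {"intent": "unknown"}

FACTS = {"python": "a programming language"}  # stand-in for a fact database

def answer(query):
    """Fulfill the intent: here, only fact retrieval is implemented."""
    intent = parse_intent(query)
    if intent["intent"] == "lookup":
        return FACTS.get(intent["topic"], "I don't know.")
    return "Sorry, I can't help with that."

print(answer("What is Python?"))  # a programming language
```

The hard problems the talk names (anaphora, cross-references, conversation state) begin exactly where this sketch stops: a rule table has no memory of previous turns.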
Gadgets pwn us? A pattern language for CALL (Lawrie Hunter)
The document discusses creating a pattern language for computer-assisted language learning (CALL). It explores the concept of a pattern language as defined by Christopher Alexander and proposes a framework for creating a CALL pattern language in the era of web 2.0. The paper seeks to rework concepts from other fields, like "formal learning design expression" and "task arc," and have participants brainstorm elements to include through graphical challenges. The overall goal is to establish foundational patterns for CALL work.
The document proposes a collaborative ontology building project (COB) that uses a multi-agent approach to facilitate distributed ontology editing and discovery. Key challenges addressed include making ontology editing easy for non-experts, enabling iterative ontology evolution through expert and agent cooperation, and facilitating ontology mining from distributed and dynamic data sources on the web. The proposed system design involves an ontology repository, various human and software agents that contribute to and validate ontologies, and techniques for tasks like ontology alignment and redundancy/conflict checking.
Natural Language Processing - Basics / Non-Technical (Dhruv Gohil)
This document provides an overview of natural language processing (NLP) and discusses several NLP applications. It introduces NLP and how it helps computers understand human language through examples like Apple's Siri and Google Now. It then summarizes popular NLP toolkits and describes applications including text summarization, information extraction, sentiment analysis, and dialog systems. The document concludes by discussing NLP system development, testing, and evaluation.
This was presented to software developers with the goal of introducing them to basic machine learning workflow, code snippets, possibilities and state-of-the-art in NLP and give some clues on where to get started.
Distributed Natural Language Processing Systems in Python (Clare Corthell)
Much of human knowledge is “locked up” in a type of data called text. Humans are great at reading, but are computers? This workshop leads you through open source data science libraries in Python that turn text into valuable data, then tours an open source system built for the Wordnik dictionary to source definitions of words from across the internet.
Thinking Machines Conference, Manila, February 2016
http://thinkingmachin.es/events/
This document summarizes a book titled "Data Structures & Algorithms in Java" by Robert Lafore. The book is 617 pages and introduces readers to manipulating data in practical ways using Java examples. It describes how the book uses animated Java programs called Workshop applets to visually demonstrate complex data structures and algorithms topics.
Thomas Wolf "An Introduction to Transfer Learning and Hugging Face" (Fwdays)
In this talk I'll start by introducing the recent breakthroughs in NLP that resulted from the combination of Transfer Learning schemes and Transformer architectures. The second part of the talk will be dedicated to an introduction of the open-source tools released by Hugging Face, in particular our transformers, tokenizers, and NLP libraries as well as our distilled and pruned models.
The Role of Natural Language Processing in Information Retrieval (Tony Russell-Rose)
The document discusses the role of natural language processing (NLP) in information retrieval. It provides background on NLP, describing some of the fundamental problems in processing text like ambiguity and the contextual nature of language. It then outlines several common NLP tools and techniques used to analyze text at different levels, from part-of-speech tagging to named entity recognition and information extraction. The document concludes that NLP can help address some of the limitations of traditional document retrieval models by identifying implicit meanings and relationships within text.
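As a feel for the gap between pattern matching and the statistical NER tools the talk surveys, here is a naive regex-based entity spotter. The patterns and example sentence are my own illustration, not from the slides; real named entity recognition uses trained sequence models precisely because rules like these break on ambiguity and context:

```python
import re

def extract_entities(text):
    """Naive pattern-based extraction: runs of capitalized words
    as candidate names, and four-digit tokens as candidate years."""
    names = re.findall(r'(?:[A-Z][a-z]+\s)+[A-Z][a-z]+', text)
    years = re.findall(r'\b\d{4}\b', text)
    return names, years

names, years = extract_entities("Ada Lovelace met Charles Babbage in London in 1843.")
print(names, years)
```

Note what the rule already misses: "London" is an entity but a single capitalized word, so the pattern skips it. That failure mode is the talk's argument for deeper linguistic analysis in retrieval.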
OpenAI’s GPT-3 Language Model - guest Steve Omohundro (Numenta)
In this research meeting, guest Stephen Omohundro gave a fascinating talk on GPT-3, the new massive OpenAI Natural Language Processing model. He reviewed the network architecture, training process, and results in the context of past work. There was extensive discussion on the implications for NLP and for Machine Intelligence / AGI.
Link to GPT-3 paper: https://arxiv.org/abs/2005.14165
Link to YouTube recording of Steve's talk: https://youtu.be/0ZVOmBp29E0
A Data Science with Python course for B.Tech, BCA, MCA, BSc, MSc, B.Com, and statistics students: online training with certified industry experts and a 100% pre-placement guarantee.
1. The BlenderBot paper reviews several prior chatbot models from Google, OpenAI, and FAIR to provide context for its contributions.
2. It describes its use of large pre-training datasets like Reddit comments, and fine-tuning on datasets for personality, empathy, knowledge, and blended skills.
3. The paper considers retrieval, generative, and retrieve-and-refine models, selecting the Poly-Encoder for retrieval and BART for generation due to their advantages, and exploring techniques like unlikelihood training and decoding strategies.
InftyReader is an Optical Character Recognition (OCR) application that automatically converts "inaccessible" math content, such as: 1) printed textbooks containing mathematics; 2) images containing mathematics; and 3) PDF files containing mathematics, into formats that are accessible by students with "print disabilities." These formats include LaTeX, MathML, and Word XML. A "print disability" is a condition related to blindness, visual impairment, specific learning disability, or another physical condition in which the student needs an alternative or specialized format (i.e., Braille, Large Print, Audio, Digital text) in order to access and acquire knowledge from conventional print/digital materials.
ChattyInfty 3 is a talking math editor. It can be used to edit files processed by InftyReader. Once editing is complete, ChattyInfty 3 can export files into a wide range of accessible formats including:
1. Spoken Text
2. DAISY 2.02 multimedia
3. DAISY 2.02 audio
4. DAISY 3 multimedia
5. DAISY 3 text (with audio for math)
6. DAISY 3 text-only
7. EPUB3 media overlays
8. EPUB3 no audio
9. EPUB3 iBooks media overlays
Natural Language Processing: L01 introduction (ananth)
This presentation introduces the course Natural Language Processing (NLP) by enumerating a number of applications, course positioning, challenges presented by Natural Language text and emerging approaches to topics like word representation.
This document introduces transfer learning and its importance for natural language processing (NLP). It discusses how transfer learning allows knowledge gained from one task or domain to be applied to another, similar to how humans learn. Large companies can train complex neural networks on vast datasets, but this requires massive resources. Transfer learning addresses this by enabling models pretrained on huge datasets to be fine-tuned and applied to new problems with far fewer resources. This paradigm has been key to advancing and democratizing NLP techniques.
Speaker: Vitalii Braslavskyi, Software Engineer at Grammarly
Summary:
Today, the dominant approach to software engineering is an imperative one — the best practices have been proven over time. But the world is always evolving, and in order to evolve with it and remain as productive as possible, we need to continue searching for better tools to solve problems of increasing complexity.
In this talk, we'll discuss the tools and techniques of the .Net ecosystem that can help us to concentrate on the problem itself — not just on the intermediate steps (which have likely already been solved). We'll compare imperative and declarative approaches and assess solutions to problems.
We'll also offer examples of how engineers in Grammarly's Office Add-in team use these tools to improve the efficiency of our engineering and strengthen our solutions to the problems at hand.
Words and sentences are the basic units of text. In this lecture we discuss basic operations on words and sentences, such as tokenization, text normalization, tf-idf, cosine similarity measures, vector space models, and word representation.
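The chain the lecture covers (tokenize, weight terms with tf-idf, compare documents with cosine similarity) fits in a short pure-Python sketch. The three sample documents are my own; real pipelines would normalize text and smooth the idf term:

```python
import math
from collections import Counter

docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "cats and dogs play"]

def tf_idf(doc, corpus):
    """Weight each term by its frequency in the document and its
    rarity across the corpus (tf * log(N / document frequency))."""
    tokens = doc.split()                     # whitespace tokenization
    counts = Counter(tokens)
    n = len(corpus)
    return {t: (counts[t] / len(tokens)) *
               math.log(n / sum(t in d.split() for d in corpus))
            for t in counts}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

v0, v1, v2 = (tf_idf(d, docs) for d in docs)
print(cosine(v0, v1), cosine(v0, v2))
```

The first two documents share "the", "sat", and "on", so their similarity is positive; the third shares no exact tokens with the first ("cat" vs. "cats"), so its similarity is zero, which motivates the word representations covered next.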
In this presentation we discuss several concepts, including word representation using SVD as well as neural-network-based techniques. In addition, we also cover core concepts such as cosine similarity and atomic and distributed representations.
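The count-based side of word representation can be sketched with NumPy: build a word-word co-occurrence matrix from a toy corpus, then factor it with SVD to get dense vectors. The corpus and window size here are my own minimal illustration of the technique, not material from the slides:

```python
import numpy as np

# Toy corpus and vocabulary.
corpus = [["i", "like", "nlp"], ["i", "like", "dogs"], ["i", "enjoy", "flying"]]
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric co-occurrence counts with a window of 1.
M = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for a, b in zip(sent, sent[1:]):
        M[idx[a], idx[b]] += 1
        M[idx[b], idx[a]] += 1

# Truncated SVD: keep the top-k components as word vectors.
U, S, Vt = np.linalg.svd(M)
k = 2
vectors = U[:, :k] * S[:k]

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# "like" and "enjoy" share the context word "i", so this factorization
# tends to place them near each other.
print(cos(vectors[idx["like"]], vectors[idx["enjoy"]]))
```

Distributed representations like these are what allow "cat" and "cats" to be similar even with no shared surface token, which atomic (one-hot) representations cannot do.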
Lecture 1: Semantic Analysis in Language Technology (Marina Santini)
This document provides an introduction to a course on semantic analysis in language technology taught at Uppsala University in Sweden. It outlines the course website, contact information for the instructor, intended learning outcomes, required readings, assignments and examination. The course focuses on applying semantic analysis methods in natural language processing tasks like sentiment analysis, information extraction, word sense disambiguation and predicate-argument extraction. It will introduce students to representing and modeling meaning in language through formal logics and semantic frameworks.
Introduction to natural language processing (Minh Pham)
This document provides an introduction to natural language processing (NLP). It discusses what NLP is, why NLP is a difficult problem, the history of NLP, fundamental NLP tasks like word segmentation, part-of-speech tagging, syntactic analysis and semantic analysis, and applications of NLP like information retrieval, question answering, text summarization and machine translation. The document aims to give readers an overview of the key concepts and challenges in the field of natural language processing.
The document provides information on the planets in our solar system. It describes the Sun as composed primarily of hydrogen and helium, and that it converts hydrogen to helium in its core through nuclear fusion. It then summarizes each planet individually, noting key facts about their composition, size, orbital characteristics, and moons.
The document provides information about the 8 planets in our solar system (Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune) as well as Pluto. It notes key facts about each planet such as their order from the sun, their composition, number of moons, and other distinguishing features. Mercury is the smallest planet and has the longest day. Venus is Earth's neighbor. Earth is the only known planet to support life. Mars is cold and named after the Roman god of war. Jupiter is the largest planet and has a Great Red Spot. Saturn has rings made of ice and dust. Uranus is tipped on its side. Neptune is stormy with faint rings and
The document outlines a 5 day Solar System Project for 3rd grade students. It aims to teach students about the size and characteristics of the Sun, Moon, Earth and planets, the causes of day and night, and the seasons. Each day would include a lecture utilizing websites and materials, followed by activities like worksheets, quizzes, and a field trip. Students would be assessed through multiple choice tests and a computer lab test at the end.
LUBY PUREE KING user manual better than joyoung soymilk maker (ez-kitchen)
This user manual provides instructions for operating the Luby Puree King multi-functional whole grain processor. It includes specifications for the model LBH-10CP1, an exploded parts list, quick operating guides for making purees and smoothies, warming functions, and preset timer settings. The manual describes proper use and provides troubleshooting for issues like overflowing or food not blending. It warns of hot surfaces and instructs users to clean parts thoroughly. The warranty information is also included, covering the product for one year against defects.
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive function. Exercise causes chemical changes in the brain that may help protect against mental illness and improve symptoms.
The solar system consists of the Sun and objects that orbit it, including 8 planets. The Sun is by far the largest object, containing 99.86% of the mass of the entire system. The planets range from Mercury, the smallest and closest to the Sun, to Neptune, the farthest. Other objects in the solar system include dwarf planets like Pluto, asteroids that orbit the Sun between Mars and Jupiter, meteoroids that become meteors as they burn up in Earth's atmosphere, and comets that have highly elliptical orbits around the Sun.
This document describes different types of air compressors and their components, including piston, screw, radial, and turbo compressors. It also describes auxiliary elements such as pipe fittings, accumulators, and water separators, along with charts on flow rate and humidity in compressed air. Finally, it presents pneumatic working elements such as single-acting and diaphragm cylinders.
Demystifying Digital Scholarship: Session 1, McMaster UniversityPaige Morgan
Slides from the first Demystifying Digital Scholarship workshop at the Sherman Centre for Digital Scholarship at McMaster University. (A potentially useful presentation for anyone wanting to learn more about digital scholarship/digital humanities)
This document provides an overview of digital humanities (DH) from Paige Morgan at the Sherman Centre for Digital Scholarship. It defines DH in various ways and notes its values include being adaptive, sustainable, multimodal, interdisciplinary, and collaborative. Most DH projects involve sources that are processed and presented for specific audiences. While DH comes in many forms, its goals generally center around using digital tools to explore available materials in new ways. The document encourages experimenting with DH and provides resources for further training and collaboration.
Demystifying Digital Scholarship: Using Social Media for Learning and Profess...Paige Morgan
This document discusses using social media for learning and professional development in academia. It explores how academics interact online, how to prepare for participating in online conversations, and which platforms are best suited for sharing work or conversing. The document encourages experimenting on conversing platforms like Twitter to improve communication skills before applying those skills to more permanent sharing platforms. It provides tips for building connections and maintaining an online presence through low-commitment activities like following lists of interest and participating in weekly chats.
This document discusses the differences between digital humanities and multimodal scholarship. It notes that digital humanities involves using digital tools to produce scholarship, while multimodal scholarship uses tools to display and disseminate traditional scholarship. It advises that how a project is presented could impact funding opportunities, and that one should consider audience perspectives on definitions. It also provides tips for managing a digital humanities project as a graduate student.
DMDH HASTAC 2015 Presentation: Building and Sustaining DH Communities Paige Morgan
Presentation by Paige Morgan and Brian Gutierrez at HASTAC 2015 on the subject of building DH community and the Demystifying Digital Humanities curriculum.
The document discusses important considerations for choosing digital tools for data visualization projects, including licensing, ownership, platform support, intended audience, flexibility, and robustness. It provides examples of free and paid tools for data visualization, mapping, project display and management, including ManyEyes, Google Maps, Google Earth, ArcGIS, MIT Simile widgets, Scalar, and Pivotal Tracker. It emphasizes that high quality metadata and understanding the components of one's data are important for enabling reuse and interoperability with other tools.
Digital humanities (DH) involves using digital technologies and computational methods for humanities research and teaching. Definitions of DH vary but commonly reference its origins in the 1940s, its evolving and multimodal nature, and tensions with traditional humanities. Practitioners describe DH as a method for humanistic inquiry or a term of convenience. While DH involves digital tools and building things, some argue its focus is the people who identify as digital humanists. Core DH values include collaboration, openness and experimentation. DH projects aim to make humanities research and data more accessible while building communities of practitioners.
This document provides guidance on using social media to develop an online professional identity as an academic. It discusses that professionalization involves communication and that an academic's value extends beyond just publications. It recommends starting with Twitter due to its flexibility and supportive community. The document discusses using Twitter to discover what others are doing, learn through conversations, and find new content. It addresses that participating in online discussions helps one become more aware of their own privilege and issues of marginalization in academia. Overall, the document emphasizes that developing an online professional identity is an active process of balancing sharing information and engaging in conversations.
Feb.2016 Demystifying Digital Humanities - Workshop 2Paige Morgan
Slides from Demystifying Digital Humanities Workshop 2: Data Wrangling: Exploring Programming in Digital Scholarship -- taught at the University of Miami Libraries in February, 2016
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersIvo Andreev
Thank you for the overview of Florence and vision capabilities. Large foundational models continue advancing multimodal abilities in helpful ways when guided by principles of safety, transparency and accountability.
This document summarizes a talk given to researchers and those interested in the semantic web. It discusses how Plone can help researchers by automating the creation of prototypes and demos using code generators. It also advocates exposing research data on the semantic web by making it available as XML or RDF files that can be accessed and used by other applications. Examples are given of applications built with Plone that export or import research data to demonstrate how data can be shared and reused.
How Does Generative AI Actually Work? (a quick semi-technical introduction to...ssuser4edc93
This document provides a technical introduction to large language models (LLMs). It explains that LLMs are based on simple probabilities derived from their massive training corpora, containing trillions of examples. The document then discusses several key aspects of how LLMs work, including that they function as a form of "lossy text compression" by encoding patterns and relationships in their training data. It also outlines some of the key elements in the architecture and training of the most advanced LLMs, such as GPT-4, focusing on their huge scale, transformer architecture, and use of reinforcement learning from human feedback.
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAnant Corporation
This document provides an agenda for a full-day bootcamp on large language models (LLMs) like GPT-3. The bootcamp will cover fundamentals of machine learning and neural networks, the transformer architecture, how LLMs work, and popular LLMs beyond ChatGPT. The agenda includes sessions on LLM strategy and theory, design patterns for LLMs, no-code/code stacks for LLMs, and building a custom chatbot with an LLM and your own data.
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...Daniel Zivkovic
Serverless Toronto's 6th-anniversary event helps IT pros understand and prepare for the #GenAI tsunami ahead. You'll gain situational awareness of the LLM Landscape, receive condensed insights, and actionable advice about RAG in 2024 from Google AI Lead Mark Ryan and LlamaIndex creator Jerry Liu. We chose #RAG (Retrieval-Augmented Generation) because it is the predominant paradigm for building #LLM (Large Language Model) applications in enterprises today - and that's where the jobs will be shifting. Here is the recording: https://youtu.be/P5xd1ZjD-Os?si=iq8xibj5pJsJ62oW
This document provides an overview of getting started with data science using Python. It discusses what data science is, why it is in high demand, and the typical skills and backgrounds of data scientists. It then covers popular Python libraries for data science like NumPy, Pandas, Scikit-Learn, TensorFlow, and Keras. Common data science steps are outlined including data gathering, preparation, exploration, model building, validation, and deployment. Example applications and case studies are discussed along with resources for learning including podcasts, websites, communities, books, and TV shows.
The document discusses the history and key concepts of C++, including its creation by Bjarne Stroustrup, the influences on its development, and its combination of efficiency from C with ease of use from object-oriented programming. It explains the three main traits of object-oriented programming - encapsulation, polymorphism, and inheritance - and provides examples of each concept. The document also provides guidance for new C++ programmers on using header files, the main function, and input/output statements in their first C++ program.
C++ was created by Bjarne Stroustrup and combines elements of C and Simula67. It supports both low-level efficiency and high-level coding through object-oriented programming principles like encapsulation, polymorphism, and inheritance. C++ programs organize code around data and define types that specify which operations can be performed on that type of data.
The document discusses Semantic Web technologies including XML, DOM, RDF, and ontologies. It provides an overview of how these layers work together, from the basic levels of Unicode and URIs, to XML which enables data sharing and transport, to RDF triplets that represent relationships between resources, to ontologies that define classes and connect related items, and finally to higher levels of logic, digital signatures, and trust. The goal of the Semantic Web is to make data on the web more intelligible to computers and enable more sophisticated question answering about relationships between different entities.
Introduction to Multimodal Language models with LLaVA. What are Multimodal models, how do they work, the LLaVA papers/models, and Image classification experiment.
Introduction to Multimodal Language models with LLaVA. What are Multimodal models, how do they work, the LLaVA papers/models, and Image classification experiment.
More information, visit: http://www.godatadriven.com/accelerator.html
Data scientists aren’t a nice-to-have anymore, they are a must-have. Businesses of all sizes are scooping up this new breed of engineering professional. But how do you find the right one for your business?
The Data Science Accelerator Program is a one year program, delivered in Amsterdam by world-class industry practitioners. It provides your aspiring data scientists with intensive on- and off-site instruction, access to an extensive network of speakers and mentors and coaching.
The Data Science Accelerator Program helps you assess and radically develop the skills of your data science staff or recruits.
Our goal is to deliver you excellent data scientists that help you become a data driven enterprise.
The right tools
We teach your organisation the proven data science tools.
The right hands
We are trusted by many industry leading partners.
The right experience
We've done big data and data science at many clients, we know what the real world is like.
The right experts
We have a world class selection of lecturers that you will be working with.
Vincent D. Warmerdam
Jonathan Samoocha
Ivo Everts
Rogier van der Geer
Ron van Weverwijk
Giovanni Lanzani
The right curriculum
We meet twice a month. Once for a lecture, once for a hackathon.
Lectures
The RStudio stack.
The art of simulation.
The iPython stack.
Linear modelling.
Operations research.
Nonlinear modelling.
Clustering & ensemble methods.
Natural language processing.
Time series.
Visualisation.
Scaling to big data.
Advanced topics.
Hackathons
Scrape and mine the internet.
Solving multiarmed bandit problems.
Webdev with flask and pandas as a backend.
Build an automation script for linear models.
Build a heuristic tsp solver.
Code review your automation for nonlinear models.
Build a method that outperforms random forests.
Build a markov chain to generate song lyrics.
Predict an optimal portfolio for the stock market.
Create an interactive d3 app with backend.
Start up a spark cluster with large s3 data.
You pick!
Interested?
Ping us here. signal@godatadriven.com
The Guide to becoming a full stack developer in 2018Amit Ashwini
This document provides a guide for becoming a full-stack developer in 2018. It outlines 8 key skills needed: 1) HTML/CSS, 2) JavaScript, 3) a back-end language like Node.js, Ruby, Python, or PHP, 4) databases and web storage, 5) HTTP and REST, 6) web application architecture, 7) Git, and 8) basic algorithms and data structures. For each skill, it provides details on important concepts and tools to learn. The goal is to learn both front-end skills like HTML/CSS and back-end skills like databases, APIs, and server-side programming in order to build complete web applications.
Python is the most widely used programming language in the world due to its simple syntax, wide platform support, and ease of use. It can be learned by both professionals and students. A survey by Stack Overflow found incredible growth in the number of visitors to Python questions on the site. Lisp is one of the oldest high-level programming languages still in use, known for its extensive use of parentheses in code. It was influential in the development of artificial intelligence. R is a programming language and software environment for statistical analysis and graphics. It provides many statistical and graphical techniques and is highly extensible.
This document discusses big data analytics tools for non-technical users. It introduces Tuktu, a platform that makes big data science accessible through a visual drag-and-drop interface. It also describes using deep learning models trained on linguistic resources to perform natural language tasks across languages with less effort. Finally, it presents CEMistry, a customer experience monitoring product that analyzes text, web, mobile, and backend data to build customer profiles.
Similar to DMDS Winter 2015 Workshop 1 slides (20)
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxEduSkills OECD
Iván Bornacelly, Policy Analyst at the OECD Centre for Skills, OECD, presents at the webinar 'Tackling job market gaps with a skills-first approach' on 12 June 2024
Andreas Schleicher presents PISA 2022 Volume III - Creative Thinking - 18 Jun...EduSkills OECD
Andreas Schleicher, Director of Education and Skills at the OECD presents at the launch of PISA 2022 Volume III - Creative Minds, Creative Schools on 18 June 2024.
This presentation was provided by Racquel Jemison, Ph.D., Christina MacLaughlin, Ph.D., and Paulomi Majumder. Ph.D., all of the American Chemical Society, for the second session of NISO's 2024 Training Series "DEIA in the Scholarly Landscape." Session Two: 'Expanding Pathways to Publishing Careers,' was held June 13, 2024.
A Visual Guide to 1 Samuel | A Tale of Two HeartsSteve Thomason
These slides walk through the story of 1 Samuel. Samuel is the last judge of Israel. The people reject God and want a king. Saul is anointed as the first king, but he is not a good king. David, the shepherd boy is anointed and Saul is envious of him. David shows honor while Saul continues to self destruct.
Temple of Asclepius in Thrace. Excavation resultsKrassimira Luka
The temple and the sanctuary around were dedicated to Asklepios Zmidrenus. This name has been known since 1875 when an inscription dedicated to him was discovered in Rome. The inscription is dated in 227 AD and was left by soldiers originating from the city of Philippopolis (modern Plovdiv).
🔥🔥🔥🔥🔥🔥🔥🔥🔥
إضغ بين إيديكم من أقوى الملازم التي صممتها
ملزمة تشريح الجهاز الهيكلي (نظري 3)
💀💀💀💀💀💀💀💀💀💀
تتميز هذهِ الملزمة بعِدة مُميزات :
1- مُترجمة ترجمة تُناسب جميع المستويات
2- تحتوي على 78 رسم توضيحي لكل كلمة موجودة بالملزمة (لكل كلمة !!!!)
#فهم_ماكو_درخ
3- دقة الكتابة والصور عالية جداً جداً جداً
4- هُنالك بعض المعلومات تم توضيحها بشكل تفصيلي جداً (تُعتبر لدى الطالب أو الطالبة بإنها معلومات مُبهمة ومع ذلك تم توضيح هذهِ المعلومات المُبهمة بشكل تفصيلي جداً
5- الملزمة تشرح نفسها ب نفسها بس تكلك تعال اقراني
6- تحتوي الملزمة في اول سلايد على خارطة تتضمن جميع تفرُعات معلومات الجهاز الهيكلي المذكورة في هذهِ الملزمة
واخيراً هذهِ الملزمة حلالٌ عليكم وإتمنى منكم إن تدعولي بالخير والصحة والعافية فقط
كل التوفيق زملائي وزميلاتي ، زميلكم محمد الذهبي 💊💊
🔥🔥🔥🔥🔥🔥🔥🔥🔥
This presentation was provided by Rebecca Benner, Ph.D., of the American Society of Anesthesiologists, for the second session of NISO's 2024 Training Series "DEIA in the Scholarly Landscape." Session Two: 'Expanding Pathways to Publishing Careers,' was held June 13, 2024.
5. There will always be new programs and platforms that you will want to experiment with.
6. Working with technology means periodically starting from scratch -- a bit like working with a new time period or culture, or figuring out how to teach a new class.
8. Being able to effectively communicate about your project as it relates to programming is a skill in itself.
10. Programming languages can...
• search for things
• match things
• read things
• write things
• receive information, and give it back, changed or unchanged
• count things
• do math
• arrange things in quantitative or random order
• respond: if x, do y OR do x until y happens
• compare things for similarity
• go to a file at a location, and retrieve readable text
• display things according to instructions that you provide
• draw points, lines, and shapes
11. They can also do many or all of these things in combination.
12. Example #1
• find all the statements in quotes ("") from a novel
• count how many words are in each statement
• put the statements in order from the smallest number of words to the largest
• write all the statements from the novel to a text file
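The steps in Example #1 can be sketched in Python (a language mentioned on a later slide). The sample sentence, filename, and regular expression here are illustrative, not from the workshop:

```python
import re

# A stand-in for the full text of a novel read from disk.
novel = 'He said, "Hello there." She replied, "Yes."'

# Find every statement inside double quotes.
statements = re.findall(r'"([^"]+)"', novel)

# Order the statements from fewest words to most.
statements.sort(key=lambda s: len(s.split()))

# Write one statement per line to a text file.
with open("statements.txt", "w", encoding="utf-8") as out:
    for s in statements:
        out.write(s + "\n")
```

The same four bullet points, one line of code each: find, count (inside the sort key), order, write.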
13. Example #2
• allow a user to type in some information, e.g., "Benedict Cumberbatch"
• compare “Benedict Cumberbatch” to a much larger file
• retrieve any data that matches the information
• print the retrieved information on screen
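Example #2 can be sketched the same way; the "larger file" is simulated below as a list of lines, and the records are invented for illustration:

```python
# Stands in for input() from a user.
query = "Benedict Cumberbatch"

# Stands in for a much larger file, read line by line.
records = [
    "Benedict Cumberbatch, actor, 1976",
    "Martin Freeman, actor, 1971",
    "Benedict Cumberbatch, Sherlock, BBC",
]

# Retrieve only the lines that match the query.
matches = [line for line in records if query in line]

for line in matches:
    print(line)
```

Note that this answers the question slide 16 raises: the program prints each *matching line* once, so how much text comes back depends entirely on how you define a match.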
14. Example #3
• "read" two texts -- say, two plays by Seneca
• search for any words that the two plays have in common
• print the words that they have in common on screen
• calculate what percentage of the words in each play are shared
• print that percentage onscreen
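A minimal sketch of Example #3, using two short stand-in strings rather than full plays, and counting distinct words only (one of several reasonable definitions of "shared"):

```python
# Stand-ins for the full texts of two plays.
play_a = "ira furor brevis est"
play_b = "ira sine viribus furor"

# Distinct words in each play.
words_a = set(play_a.lower().split())
words_b = set(play_b.lower().split())

# Words the two plays have in common.
shared = words_a & words_b
print(sorted(shared))

# Percentage of each play's distinct words that are shared.
pct_a = 100 * len(shared) / len(words_a)
pct_b = 100 * len(shared) / len(words_b)
print(f"{pct_a:.1f}% of play A, {pct_b:.1f}% of play B")
```

Deciding whether "words" means tokens or distinct forms, and whether to fold case, is exactly the kind of disciplinary judgment the next slides describe.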
15. Example #4
• if the user is located in geographic location Z, e.g., 45th and University, go to an online address and retrieve some text
• print that text on the user’s tablet screen
• receive input from the user and respond
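The location test in Example #4 can be sketched as a simple bounding-box check. The coordinates and tolerance below are made up for illustration, and the fetch-and-display steps are left as comments:

```python
# Hypothetical coordinates for "location Z" (45th and University).
LOCATION_Z = (47.6615, -122.3127)

def near(user, target, tolerance=0.001):
    """True if the user is within a small box around the target."""
    return (abs(user[0] - target[0]) <= tolerance and
            abs(user[1] - target[1]) <= tolerance)

# Stands in for a position reported by the user's device.
user_position = (47.6612, -122.3125)

if near(user_position, LOCATION_Z):
    # In a real application you would now retrieve text from an
    # online address (e.g., with urllib.request), display it on
    # the tablet screen, and wait for the user's response.
    print("User is at location Z: retrieve and display the text.")
```
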
16. However...
• In Example #1, the computer is focusing on things that characters say. But what if you want to isolate speeches from just one character?
• In Example #2, how does the computer know how much text to print? Will it just print "Benedict Cumberbatch" 379 times, because that's how often it appears in the larger file?
17. These are the areas of programming where critical thinking and specialized disciplinary knowledge become vital.
18. The Difference
• Humans are good at differentiating between material in complex and sophisticated ways.
• Computers are good at not differentiating between material unless they’ve been specifically instructed to do so.
19. Computers work with data. You work with data, too -- but you may have to do extra work to make your data readable by computer.
20. Ways to make your data machine-readable
• Annotate it with a markup language
• Organize it in patterns that the computer can understand
• Add metadata that is not explicitly readable in the current format (e.g., hardbound/softbound binding; language: English; date of record creation)
21. Depending on the data you have, and the way you annotate or structure it, different things become possible.
22. Your goal is to make the data As Simple As Possible -- but not so simple that it stops being useful.
23. Depending on the data you work with, the work of structuring or annotating becomes more challenging, but also more useful.
25. Many programming languages have governing bodies that establish standards for their use:
• the World Wide Web Consortium (W3C) (http://www.w3.org/standards/)
• the TEI Technical Council
26. Data Examples
• Annotated (markup languages: HTML, TEI)
• Structured (MySQL)
• Combination (Linked Open Data)
• Object-Oriented Programming (Java, Python, Ruby)
29. Markup: HTML
Anything can be data -- and markup languages provide instructions for how computers should treat that data.
30. Markup: HTML
HTML is used to format text on webpages.
<p> separates text into paragraphs.
<em> marks text as emphasized (usually rendered in italics).
These are just a few of the HTML formatting instructions that you can use.
31. HTML Syntax Rules
• Open and closed tags: <> and </>
• Attributes (2nd-level information) defined using =“”
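Put together, those two rules look like this in a minimal fragment (the tags shown are illustrative; the URL is the W3C standards address from an earlier slide):

```html
<p>This paragraph contains <em>emphasized</em> text.</p>
<p>Attributes add second-level information, such as a
   <a href="http://www.w3.org/standards/">link destination</a>.</p>
```

Every opening tag has a matching closing tag, and the attribute value sits inside quotes after the equals sign.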
37. Poetry w/ TEI
<text xmlns="http://www.tei-c.org/ns/1.0" xml:id="d1">
<body xml:id="d2">
<div1 type="book" xml:id="d3">
<head>Songs of Innocence</head>
<pb n="4"/>
<div2 type="poem" xml:id="d4">
<head>Introduction</head>
<lg type="stanza">
<l>Piping down the valleys wild, </l>
<l>Piping songs of pleasant glee, </l>
<l>On a cloud I saw a child, </l>
<l>And he laughing said to me: </l>
</lg>
39. TEI’s syntax rules are identical to HTML’s -- though your normal browser can’t work with TEI the way it works with HTML.
40. TEI is meant to be a highly social language that anyone can use and adapt for new purposes.
41. In order for TEI to successfully encode texts, it has to be adaptable to individual projects.
42. Anything that you can isolate (and put in brackets) can (theoretically) be pulled out and displayed for a reader.
43. TEI can be used to encode more than just text:
<div type="shot">
  <view>BBC World symbol</view>
  <sp>
    <speaker>Voice Over</speaker>
    <p>Monty Python's Flying Circus tonight comes to you live
    from the Grillomat Snack Bar, Paignton.</p>
  </sp>
</div>
<div type="shot">
  <view>Interior of a nasty snack bar. Customers around, preferably
  real people. Linkman sitting at one of the plastic tables.</view>
  <sp>
    <speaker>Linkman</speaker>
    <p>Hello to you live from the Grillomat Snack Bar.</p>
  </sp>
</div>
44. Or, you could encode all of Stephenie Meyer’s Twilight according to its emotional register.
45. Whether you include or exclude some aspect of the text in your markup can be very important from an academic perspective.
46. The challenge of creating good data is one reason that collaboration is so important to digital scholarship.
47. Wise Data Collaboration
• Avoid reinventing the wheel (has someone else already created an effective method for working with this data?)
• Consider the labor involved vs. the outcome (and future use of the data you create).
49. Study Scenario #1
• You study urban espresso stands: their hours, brands of coffee, whether or not they sell pastries, and how far the espresso stands are from major roadways.
50. Study Scenario #2
• You study female characters in novels written between 1700 and 1850. Encoding a whole novel just to study female characters isn’t practical for you.
52. Structured Data: Example #1 (MySQL)

ID  | Name            | Location                       | Hours                | Coffee Brand         | Pastries (Y/N) | Distance from Street
----|-----------------|--------------------------------|----------------------|----------------------|----------------|---------------------
008 | Java the Hut    | 56 Farringdon Road, London, UK | 7:00 a.m.-2:00 p.m.  | Square Mile Roasters | N              | 25 meters
009 | Prufrock Coffee | 18 Shoreditch High Street      | 7:00 a.m.-10:00 p.m. | Monmouth             | Y              | 10 meters
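A table like this can be created and queried with a few lines of SQL. The sketch below uses Python's built-in sqlite3 module instead of MySQL so that it runs without a database server; the SQL itself is nearly identical, and the column names are adapted from the slide:

```python
import sqlite3

# An in-memory database standing in for a MySQL server.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE stands (
        id TEXT, name TEXT, location TEXT, hours TEXT,
        coffee_brand TEXT, pastries TEXT, distance_m INTEGER
    )
""")
con.executemany(
    "INSERT INTO stands VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("008", "Java the Hut", "56 Farringdon Road, London, UK",
         "7:00 a.m.-2:00 p.m.", "Square Mile Roasters", "N", 25),
        ("009", "Prufrock Coffee", "18 Shoreditch High Street",
         "7:00 a.m.-10:00 p.m.", "Monmouth", "Y", 10),
    ],
)

# Which stands sell pastries?
rows = con.execute(
    "SELECT name FROM stands WHERE pastries = 'Y'"
).fetchall()
print(rows)
```

Because each fact lives in its own column, questions like "which stands within 20 meters of the street sell pastries?" become one-line queries.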
56. What’s an “object”?
• An object is a structure that contains data in one or more forms.
• Common forms include strings, integers, and arrays (groups of data).
• Example (handout)
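As a stand-in for the handout's example, here is a minimal Python object bundling a string, an integer, and an array (a Python list); the class and field names are illustrative:

```python
class Poem:
    """A simple object holding data in three common forms."""

    def __init__(self, title, year, lines):
        self.title = title  # a string
        self.year = year    # an integer
        self.lines = lines  # an array (list) of strings

# Sample data drawn from the Blake poem on an earlier slide.
intro = Poem("Introduction", 1789, [
    "Piping down the valleys wild,",
    "Piping songs of pleasant glee,",
])

print(intro.title, intro.year, len(intro.lines))
```
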
57. Object-oriented programming, cont’d
• Learning a bit about an OOP language can help you become accustomed to working with programming
• Reading OOP code can also be useful
• Many free tutorials are available
• Goal: to be able to converse more effectively with professional programmers, rather than become an expert yourself.
58. How your data is structured will influence the technology that you (can) use to work with it.
63. Every project has data. Text objects, images, tags, geographical coordinates, categories, records, creator metadata, etc.
64. Even if you’re not planning to learn any programming skills, you are still working with data.
65. Next time: Programming on the Whiteboard
February 19th, 3:00-5:00 p.m., Sherman Centre
• Cleaning data before you work with it!
• Identifying specific programming tasks
• How access affects your project idea
• Flash project development
• Homework: bring some data to work with.