Keynote at the 6th Estonian Digital Humanities Conference in Tartu.
If you wish to have the comments and text for the slides, please email me at the address given in the last slide.
Slides from my lecture for the Information Retrieval and Data Mining course at University College London.
The slides cover introductory concepts on topic models, vector semantics and basic end applications.
Presentation of my bachelor's thesis in Information Science. It provides an overview of my attempt to use parsimonious language models on parliamentary proceedings to derive characteristic words for left-wing and right-wing parties, and to compare the occurrences of these words in subtitles of programmes broadcast by Dutch public broadcasting organizations.
Kick-off meeting on February 24th 2017 for the Linkflows project, a collaboration between the Web & Media Sciences Group, Computer Science Department, Vrije Universiteit Amsterdam, IOS Press and Netherlands Institute for Sound and Vision.
Dhn2018-A Study on Word2Vec on a Historical Swedish Newspaper Corpus by Nina Tahmasebi
This document summarizes a study on using Word2Vec to analyze word embeddings over time in a historical Swedish newspaper corpus covering 1749 to 1925. The study tracks how the meanings of 11 words change over time by training Word2Vec models on the corpus yearly and comparing the resulting word vectors. Preliminary results show word vectors become more stable as word frequency increases, and that on average words share meanings with 2-3 other words each year. Examples of shifting meanings for words like "woman", "politics", and "happy" are provided in Swedish. Future work involves handling OCR errors and finding diachronic word replacements and sense-based embeddings to better capture word sense changes over time.
Neural Text Embeddings for Information Retrieval (WSDM 2017) by Bhaskar Mitra
The document describes a tutorial on using neural networks for information retrieval. It discusses an agenda for the tutorial that includes fundamentals of IR, word embeddings, using word embeddings for IR, deep neural networks, and applications of neural networks to IR problems. It provides context on the increasing use of neural methods in IR applications and research.
Text analytics and R - Open Question: is it a good match? by Marina Santini
http://www.forum.santini.se
* The Quest: finding the optimal way to handle Big Textual Data for Information Discovery
* The Question: is R convenient for text analytics of Big TEXTUAL Data?
* Mission: identification of pros, cons, limits, benefits …
* Current Status: investigation in progress…
The document discusses cultural text mining workflows and methods. It describes extracting subsets of digitized newspaper text, stripping metadata, and uploading to Voyant for analysis. Topic modeling is used to identify context-specific words by creating word and topic distributions from texts. Interesting keywords from topic modeling outputs can then be used for further exploration of word frequencies and changes over time in historical corpora. The goal is to discover new research questions through exploratory analysis of digitized cultural texts.
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 1... by Europeana
This document discusses multilingualism in digital cultural heritage. It begins by outlining some of the challenges of multilingual access, including mismatches between user queries and content languages, heterogeneity in queries, and issues with translating metadata. It then discusses some options for bridging the language gap, such as translating queries, content, and metadata; enriching metadata; and adapting systems to better support multilingual exploration. While progress has been made, areas that still need work include improving machine translation for small languages and specialized domains, evaluating solutions, and developing multilingual entity graphs to aid exploration.
Professor Andrew Dillon's presentation "Perspectives on the evidence, value and impact of LIS research: conceptual challenges" at the LIS Research Coalition conference, British Library Conference Centre, London 28 June 2010: http://lisresearch.org/conference-2010/, hashtag #lisrc10
EssayWriting Writing Introductions And Conclusions Teaching Resour by Erica Wright
The document discusses the use of drones by the FBI and other government agencies for surveillance purposes inside the United States. It notes that while drones can help stop terror plots, some people are worried about privacy issues with government spying. The document also references other articles about the use of drones in agriculture to monitor crops and the potential for drone delivery services in the future. Overall, the document examines both the benefits of drone surveillance for security purposes and some of the privacy concerns raised by government use of drones domestically.
Timo Honkela: Digital Preservation and Computational Modeling of Language and...
A presentation in the symposium “Interfaces between Language, Literature and Culture: Research at Department of Modern Languages” at University of Helsinki, 19th of May, 2014
This document discusses the emerging field of bibliographic data science. It provides an overview of large bibliographic datasets from national libraries and introduces research done by the Helsinki Computational History Group using these datasets. The group has developed methods to clean, standardize, and link bibliographic metadata to study topics like publishers, authors, languages, and physical book dimensions over time. Their goal is to build an open ecosystem for bibliographic data science to enable new historical research through data analysis.
Deep learning for natural language embeddings by Roelof Pieters
This document discusses approaches to understanding natural language through deep learning techniques. It begins by outlining some of the challenges of language understanding, such as ambiguity and productivity. It then discusses using neural networks for natural language processing tasks like language modeling, sentiment analysis and machine translation. Recurrent and recursive neural networks are presented as approaches to model the compositionality of language. Different methods for obtaining word embeddings like Word2Vec, GloVe and earlier distributional semantic models are also summarized.
"Data is the new water in the digital age"
Anthony (Tony) Nolan OAM, Lead Data Scientist, G3N1U5 Pty Ltd, presented a summary of his research as part of the SMART Seminar Series on 6 June 2016.
For more information, visit the event page at: http://smart.uow.edu.au/events/UOW214302.html.
Rolf Hapel presented on rethinking the public library. He discussed trends like declining populations, urbanization, and the rise of digital resources. This requires reinventing library services through new partnerships, digital offerings like the Danish Digital Library, and reimagining physical spaces. The library must meet changing needs through collaboration, co-creation with users, and integrating services like citizens' help. Lessons include thinking of libraries as places for prototyping, advocacy, and answering problems in society through knowledge, ideas and inspiration.
Student Introduction to National History Day in Ohio by Megan Wood
This document provides an introduction to National History Day (NHD), which allows students to research historical topics of their choosing. It explains that NHD aims to teach research and analytical skills over rote memorization of dates. Students can work individually or in groups to research topics related to an annual theme, analyze their significance, and present their findings through various project formats. Guidance is offered on choosing topics, conducting research, developing a thesis statement, and participating in NHD competitions at various levels.
Timo Honkela: Artificial Intelligence and Machine Learning in the Service of ...
The document discusses artificial intelligence and machine learning and their potential applications. It notes that AI is being used in many services from large tech companies and was originally developed in universities. It also discusses different AI paradigms and gives examples of research from the University of Helsinki applying machine learning to tasks like pattern recognition, sentiment analysis, machine translation and more. The document expresses optimism that AI can be developed and applied to benefit society by addressing challenges in healthcare, transportation, education and more, while also needing to consider ethical concerns.
The document discusses machine learning approaches for natural language processing. It provides an overview of various NLP tasks, such as tagging, parsing, and information extraction, that can be addressed using machine learning techniques. Both deductive and inductive approaches are described for developing models from linguistic data and applying them to solve problems in language processing and understanding.
Big Data in Economic Research: Twitter, Phone calls and Political events by PhDSofiaUniversity
This document summarizes research using various types of big data including call detail records, political event data, and Twitter data. It discusses how call detail records have been used to study the spread of diseases, optimize transportation networks, and track population displacement. Political event data and mining data have been combined to examine how minerals fuel local conflicts in Africa. Twitter data has been analyzed to map language distributions in Europe and study international migration patterns, such as tracking the migration of Venezuelans during an economic crisis.
This document summarizes research on the digital literacy practices of newly arrived Syrian refugees. It begins with conceptions of literacy as social practices that vary across contexts and cultures. Current research shows refugees often rely on digital tools like smartphones for integration needs. The document then presents three data samples from the researcher's study: an interview with a refugee using Google Translate, observation of a refugee using a livestreaming app to practice English, and analysis of a Syrian community Facebook group. It concludes that refugees can be expert users of technologies for language learning and integration, and that social media allows refugees to both engage with and produce language in meaningful ways to support their daily lives.
This document discusses the relationship between language, culture, and software. It covers several key points:
1) Language and culture are closely intertwined, with language representing the dominant sign of any given culture. Different languages reflect different worldviews.
2) Cultural variables like date formats, number formatting, and colors must be taken into account for software to be successfully localized for other cultures. Internationalization involves developing a cultural model and using international variables.
3) Localization requires adapting aspects of software like graphics, text directionality, and shortcuts to be appropriate for a given language and culture. This helps ensure software usability and understanding across cultures.
Held at the 2nd European Summer School "Cultures & Technologies" (ESU-CT) in Leipzig, Germany, on July 28th, 2010. Thanks to everyone at the summer school for their feedback and many interesting discussions!
On Languages and Sharing (open data), Eliana Trinaistic & Veronica Costea
This document discusses how open data can benefit the language industry. It begins with background on open data and examples of organizations working with open data. It then discusses challenges social purpose organizations in the language industry face in measuring their impact and how open data sharing can help with strategic clarity, effective advocacy, efficient planning and measurable outcomes. The document concludes by discussing the importance of collaboration and building trust to effectively use open data and advocating for language rights.
Character-based Neural Embeddings for Tweet Clustering by Svitlana Vakulenko
Paper: https://arxiv.org/abs/1703.05123
We show how the performance of tweet clustering can be improved by leveraging character-based neural networks. The proposed approach overcomes the limitations related to vocabulary explosion in word-based models and allows for seamless processing of multilingual content. Our evaluation results and code are available online at https://github.com/vendi12/tweet2vec_clustering
Presented at the SocialNLP workshop @EACL2017, 3 April 2017
In this talk, I will discuss how we can use digital methods to generate sustainable knowledge in the humanities. I will give an overview of the data-intensive research methodology and discuss how methods, results, and data relate to each other and must be evaluated as parts of a whole: there is no such thing as a good method, nor is there a way to know if the results are good, without considering the data. I will discuss results as a window from which we can see our data, and how we can reason about the results of digital methods. I will also present the Change is Key! research program and describe our efforts to connect computational research with research questions from the humanities and social sciences.
In this talk, I will discuss how we can use digital methods to generate sustainable knowledge in the humanities. I will give an overview of the data-intensive research methodology and discuss how methods, results, and data relate to each other and must be evaluated as parts of a whole: there is no such thing as a good method, nor is there a way to know if the results are good, without considering the data. I will discuss results as a window from which we can see our data, and how we can reason about the results of digital methods. Finally, I will present the Change is Key! research program and describe our efforts to connect computational research with research questions from the humanities and social sciences.
Similar to Detecting language change for the digital humanities
In this talk, we will present the Change is Key! program, a 6-year research program where we combine methods for semantic change and lexical variation to answer research questions stemming from the humanities and social sciences. We will first introduce different classes of methods for computationally detecting semantic change, ranging from topic modelling to contextual embeddings, and discuss how the results should be valued and evaluated. The talk will further shed light on research questions from the humanities and social science focus domains that will be tackled in the Change is Key! program.
An introduction to lexical semantic change by Nina Tahmasebi at the Online Workshop on Automatic Detection of Semantic Change in Stuttgart, October 2020
https://www.ims.uni-stuttgart.de/en/institute/news/event/Online-Workshop-on-Automatic-Detection-of-Semantic-Change/
https://languagechange.org/
This document summarizes Nina Tahmasebi's presentation on the strengths and pitfalls of large-scale text mining for literary studies. It discusses 1) views on digital text and how text mining approaches treat text, 2) a data-intensive research methodology involving hypotheses, text mining methods, and results interpretation, and 3) how results should be interpreted in relation to the research question, text, and method used. The presentation emphasizes that the combination of data, methods, and research questions must be evaluated together for valid conclusions.
Workshop on Digital Literacy - Digital text and data-intensive research by Nina Tahmasebi
How does digital text relate to written non-digital text? What do we need to think about when using large-scale digital methods and interpreting the results?
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx by MAGOTI ERNEST
Although Artemia has been known to man for centuries, its use as a food for the culture of larval organisms apparently began only in the 1930s, when several investigators found that it made an excellent food for newly hatched fish larvae (Litvinenko et al., 2023). As aquaculture developed in the 1960s and '70s, the use of Artemia also became more widespread, due both to its convenience and to its nutritional value for larval organisms (Arenas-Pardo et al., 2024). The fact that Artemia dormant cysts can be stored for long periods in cans, and then used as an off-the-shelf food requiring only 24 h of incubation, makes them the most convenient, least labor-intensive live food available for aquaculture (Sorgeloos & Roubach, 2021). The nutritional value of Artemia, especially for marine organisms, is not constant, but varies both geographically and temporally. During the last decade, however, both the causes of Artemia nutritional variability and methods to improve poor-quality Artemia have been identified (Loufi et al., 2024).
Brine shrimp (Artemia spp.) are used in marine aquaculture worldwide. Annually, more than 2,000 metric tons of dry cysts are used for cultivation of fish, crustacean, and shellfish larvae. Brine shrimp are important to aquaculture because newly hatched brine shrimp nauplii (larvae) provide a food source for many fish fry (Mozanzadeh et al., 2021). Culture and harvesting of brine shrimp eggs represents another aspect of the aquaculture industry. Nauplii and metanauplii of Artemia, commonly known as brine shrimp, play a crucial role in aquaculture due to their nutritional value and suitability as live feed for many aquatic species, particularly in larval stages (Sorgeloos & Roubach, 2021).
BREEDING METHODS FOR DISEASE RESISTANCE.pptx by RASHMI M G
Plant breeding for disease resistance is a strategy to reduce crop losses caused by disease. Plants have an innate immune system that allows them to recognize pathogens and provide resistance. However, breeding for long-lasting resistance often involves combining multiple resistance genes.
ESR spectroscopy in liquid food and beverages.pptxPRIYANKA PATEL
With increasing population, people need to rely on packaged food stuffs. Packaging of food materials requires the preservation of food. There are various methods for the treatment of food to preserve them and irradiation treatment of food is one of them. It is the most common and the most harmless method for the food preservation as it does not alter the necessary micronutrients of food materials. Although irradiated food doesn’t cause any harm to the human health but still the quality assessment of food is required to provide consumers with necessary information about the food. ESR spectroscopy is the most sophisticated way to investigate the quality of the food and the free radicals induced during the processing of the food. ESR spin trapping technique is useful for the detection of highly unstable radicals in the food. The antioxidant capability of liquid food and beverages in mainly performed by spin trapping technique.
The binding of cosmological structures by massless topological defectsSérgio Sacani
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field
equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational
field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin
spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling
concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect
light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or modified gravity theory is
mitigated, at least in part.
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Ana Luísa Pinho
Functional Magnetic Resonance Imaging (fMRI) provides means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt overall to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects-of-interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich on features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and quality to enable complex behavior compounded by discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization. 
To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...University of Maribor
Slides from:
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Track: Artificial Intelligence
https://www.etran.rs/2024/en/home-english/
The ability to recreate computational results with minimal effort and actionable metrics provides a solid foundation for scientific research and software development. When people can replicate an analysis at the touch of a button using open-source software, open data, and methods to assess and compare proposals, it significantly eases verification of results, engagement with a diverse range of contributors, and progress. However, we have yet to fully achieve this; there are still many sociotechnical frictions.
Inspired by David Donoho's vision, this talk aims to revisit the three crucial pillars of frictionless reproducibility (data sharing, code sharing, and competitive challenges) with the perspective of deep software variability.
Our observation is that multiple layers — hardware, operating systems, third-party libraries, software versions, input data, compile-time options, and parameters — are subject to variability that exacerbates frictions but is also essential for achieving robust, generalizable results and fostering innovation. I will first review the literature, providing evidence of how the complex variability interactions across these layers affect qualitative and quantitative software properties, thereby complicating the reproduction and replication of scientific studies in various fields.
I will then present some software engineering and AI techniques that can support the strategic exploration of variability spaces. These include the use of abstractions and models (e.g., feature models), sampling strategies (e.g., uniform, random), cost-effective measurements (e.g., incremental build of software configurations), and dimensionality reduction methods (e.g., transfer learning, feature selection, software debloating).
I will finally argue that deep variability is both the problem and solution of frictionless reproducibility, calling the software science community to develop new methods and tools to manage variability and foster reproducibility in software systems.
Exposé invité Journées Nationales du GDR GPL 2024
What is greenhouse gasses and how many gasses are there to affect the Earth.moosaasad1975
What are greenhouse gasses how they affect the earth and its environment what is the future of the environment and earth how the weather and the climate effects.
hematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. This test helps to evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results of the test can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills
Nucleophilic Addition of carbonyl compounds.pptxSSR02
Nucleophilic addition is the most important reaction of carbonyls. Not just aldehydes and ketones, but also carboxylic acid derivatives in general.
Carbonyls undergo addition reactions with a large range of nucleophiles.
Comparing the relative basicity of the nucleophile and the product is extremely helpful in determining how reversible the addition reaction is. Reactions with Grignards and hydrides are irreversible. Reactions with weak bases like halides and carboxylates generally don’t happen.
Electronic effects (inductive effects, electron donation) have a large impact on reactivity.
Large groups adjacent to the carbonyl will slow the rate of reaction.
Neutral nucleophiles can also add to carbonyls, although their additions are generally slower and more reversible. Acid catalysis is sometimes employed to increase the rate of addition.
Detecting language change for the digital humanities
1. Detecting language change for the digital humanities; challenges and opportunities
Nina Tahmasebi, PhD
University of Gothenburg
6th Estonian Digital Humanities conference
2. Digital humanities (Postdoc at CDH)
Mathematics (B.Sc. & M.Sc.)
Me
Electrical Engineering
Computer science (PhD + Postdoc)
NLP / Language Technology (Researcher)
13. LiWA – Living Web Archives
preparing for evolution-aware access support
dealing with terminology evolution
Semantic & Terminology Evolution
Noise and Spam Filtering
Improved Capturing
Existing Web Archiving Technology
Temporal Coherence
14. What is new?
time
Increasing amount of historical texts in digital format
Easy digital access for anyone! Not only scholars.
Possibility to digitally analyze historical documents at large scale.
Information from primary sources. Not only modern interpretations.
Text-based Digital Humanities
29. Aims
To find word sense changes automatically by
1 Modeling word senses
2 Comparing these over time
To find what changes, how it changed and when it changed
Stone, Music, Lifestyle, Rock
30. 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018
Tahmasebi et al., 2008
Single-sense
Sense-differentiated
Sagi et al, 2009
Gulordava & Baroni, 2011
Tang et al, 2013
Kim et al, 2014
Kulkarni et al, 2015
Hamilton et al; Eger and Mehler; Rodda et al; Basile et al, 2016
Azarbonyad et al; Takamura et al; Kahmann & Heyer; Bamler & Mandt, 2017
Yao et al; Rudolph & Blei, 2018
Wijaya & Yeniterzi, 2011
Lau et al, 2012
Cook et al, 2013
Cook et al; Mitra et al, 2014
Mitra et al, 2015
Frermann & Lapata; Tang et al, 2016
Tahmasebi & Risse, 2017
Costin-Gabriel & Rebedea
Tjong Kim Sang, 2016
embeddings, dynamic embeddings, neural embeddings, topic models, word sense induction
Mihalcea & Nastase, 2012
31. topic models
word sense induction
2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018
embeddings
dynamic embeddings
neural embeddings
34. Downsides
Random in
• Initialization
• Order in which the training examples are seen
100 Million tokens per time span*
Typically learn one vector per word
Stable/less dominant senses get lost!
Stone, Music, Lifestyle, Rock
36. Our study
[Figure: Size of Kubhist in tokens. x-axis: Year (1749-1923); y-axis: Number of tokens (0 to 16,000,000)]
* https://spraakbanken.gu.se/korp/?mode=kubhist
Word2Vec (W2V): a two-layer neural net (skip-gram)
KubHist*: Swedish Newspapers, 1749-1925
Trained yearly vectors
37. What did we do?
11 (10) words over time
nyhet 'news'
tidning 'newspaper'
politik 'politics'
telefon 'telephone'
telegraf 'telegraph'
kvinna 'woman'
man 'man'
glad 'happy'
retorik 'rhetoric'
resa 'travel'
musik 'music'
A = {happy, smiling, glad}
B = {happy, joyful, cheerful, excited}
Overlap = 1
Unique = 3+4-1 = 6
Jaccard similarity = 1/6
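The worked example above can be reproduced in a few lines of Python. This is a minimal sketch using built-in sets; the function name `jaccard_similarity` is an illustrative choice and not taken from the study.

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty sets: define the similarity as 0.0 to avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


# The two word sets from the slide.
A = {"happy", "smiling", "glad"}
B = {"happy", "joyful", "cheerful", "excited"}

overlap = len(A & B)   # words shared by both sets
unique = len(A | B)    # distinct words across both sets (3 + 4 - 1)
similarity = jaccard_similarity(A, B)

print(overlap, unique, similarity)
```

Sets are used so that duplicate words in a neighbour list cannot inflate the overlap.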
40. Result summary
Avg. Jaccard similarity, normalized frequency and Spearman correlation
The more frequent the term, the more stable the vectors
0.11-0.19 overlap between years
2-3 words in common each year
59. I like the room but not the sheets.
I like the room but not the sheets. (after stop word filtering)
I like the room but not the sheet. (after lemmatization)
I like the room but not the sheet. (only nouns)
I like the room but not the sheet. (frequency filtering)
I like the room but not the sheet. (only verbs)
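The filtering steps listed on this slide can be sketched on the same sentence. The stop-word list, lemma table, part-of-speech tags and corpus counts below are tiny hand-made assumptions that cover only this sentence; the slide does not say which resources or thresholds were used, and a real pipeline would rely on a proper lemmatizer and tagger.

```python
# Toy resources: all four tables are hypothetical and cover only this sentence.
STOP_WORDS = {"i", "the", "but", "not"}
LEMMAS = {"sheets": "sheet"}
POS_TAGS = {"like": "VERB", "room": "NOUN", "sheet": "NOUN"}
CORPUS_COUNTS = {"like": 120, "room": 45, "sheet": 3}  # invented frequencies


def tokenize(text):
    """Lowercase, split on whitespace and strip surrounding punctuation."""
    tokens = (t.strip(".,!?;:").lower() for t in text.split())
    return [t for t in tokens if t]


def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]


def lemmatize(tokens):
    # Words missing from the toy table are left unchanged.
    return [LEMMAS.get(t, t) for t in tokens]


def keep_pos(tokens, tag):
    return [t for t in tokens if POS_TAGS.get(t) == tag]


def frequency_filter(tokens, min_count):
    return [t for t in tokens if CORPUS_COUNTS.get(t, 0) >= min_count]


sentence = "I like the room but not the sheets."
tokens = tokenize(sentence)
no_stop = remove_stop_words(tokens)
lemmas = lemmatize(no_stop)
nouns = keep_pos(lemmas, "NOUN")
verbs = keep_pos(lemmas, "VERB")
frequent = frequency_filter(lemmas, min_count=5)

for label, result in [("tokens", tokens), ("stop words removed", no_stop),
                      ("lemmatized", lemmas), ("only nouns", nouns),
                      ("only verbs", verbs), ("frequency >= 5", frequent)]:
    print(f"{label}: {result}")
```

With this particular stop list the negation "not" disappears, so each filter shrinks the vocabulary at the cost of some of the sentence's meaning.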