Georg Rehm, Felix Sasaki, and Aljoscha Burchardt. Web Annotations - A Game Changer for Language Technologies? I Annotate 2016, Berlin, Germany, May 2016. May 19/20, 2016.
Turning software for CNC lathe machines (Shehbaz Mulla)
This document summarizes CAM software for programming CNC lathe machines. It describes features like automatic and interactive recognition of features on solid models or imported files. It also covers capabilities for 2-axis and 4-axis lathes like toolpaths for front and rear turrets, stock simulation, toolpath verification, and post-processing support for sub-spindles. Functions are included for roughing, grooving, finishing, threading, cutoff cycles and single point tools.
This document provides an overview of Pundit, an open source web annotation tool. Pundit allows users to annotate web pages, text fragments, and images. It includes Pundit Annotator for basic annotations and comments, and Pundit Annotator Pro for semantic annotations using Linked Open Data. Pundit also includes an Annotation Manager for centralized annotation management and an Annotation Server that stores annotations using the W3C Web Annotation Data Model. Pundit aims to enable collaborative annotation and knowledge sharing through a decentralized and crowdsourced approach.
Presentation of context: Web Annotations (& Pundit) during the StoM Project (... (Net7)
This is one of the presentations used for the StoM project final review (http://www.stom-project.eu/). It presents the state of the art in Web Annotation and shows how developments in this area over the last two years were taken into account to improve Pundit (http://thepund.it/), Net7's semantic annotation system.
Maruti Gollapudi has over 17 years of experience as a principal architect, specializing in digital customer experience. Some of his significant contributions include developing a data aggregation and analytics platform hosted on AWS that enables capabilities like social analytics, text analytics using NLP and machine learning, and enterprise search. He has experience building solutions leveraging technologies such as Java, JBoss, Kafka, MongoDB, Solr, Watson, and various analytics and social APIs. Recent projects include developing a headless CMS for page building and dynamic content modification for CNBC, and architecting a middleware for CNBC's integration with Uber to dynamically serve ride-related content.
The document discusses changes in technical communication and information development over the past 10 years from the perspective of a practitioner. It describes shifts from technical writer to technical communicator to information developer. It also outlines changes in focus from content development to user experience, delivery methods, and ensuring content is consumable. Examples are provided from past and recent SIGDOC proceedings that illustrate these changes.
This document contains the resume of Jay J. Rawal, who has over 10 years of experience as a web developer. He currently works as a Senior Information Specialist at Franklin Templeton Investments in Mumbai, where he is responsible for developing and maintaining technology solutions to improve access to investment information. His skills include web design, database design, project management, and analytics tools like Tableau. He has experience with technologies such as PHP, HTML, CSS, and SharePoint.
The document discusses semantic web technology, which aims to make information on the web better understood by machines by giving data well-defined meaning. It outlines the evolution of web technologies from the initial web to the semantic web. Key aspects of semantic web technology include ontologies to define common vocabularies, semantic annotations to associate meaning with data, and reasoning capabilities to enable complex queries and analyses. Languages, tools, and applications are needed to implement these semantic web standards and make the web of linked data usable.
Nikhil Bagde has a Master's degree in Computer Science from Binghamton University and a Bachelor's degree in Computer Engineering from Pune University in India. He has over 2 years of experience as a Software Engineer developing web applications using Java, J2EE, MySQL, JQuery and CSS. His technical skills include programming languages like Java, C/C++, Python, and technologies like Struts, Hibernate, Spring, MySQL, Oracle, Linux and IDEs like Eclipse and IntelliJ. He has worked on projects involving recommender systems, decision trees, natural language processing and multi-threaded applications. Nikhil also has leadership experience organizing technical events and doing community service.
The document discusses the evolution and need for web engineering. It provides background on the history of web development, from static HTML pages to dynamic content management systems. It then covers the characteristics of web applications, including different types of users, tasks, technologies used, and contextual factors. The document argues that the continuous change of requirements, competitive pressures, and fast pace of development necessitate an engineering approach and ongoing evolution of web applications.
SEMANACCO is a web application that allows accommodation providers to easily generate semantically annotated accommodation descriptions using vocabularies like Schema.org and the Accommodation Ontology. It was developed to facilitate search engine optimization by enabling users to create "Rich Snippets" for search results. The application uses technologies like HTML5, Java, and Google Web Toolkit for high performance. It focuses on accommodations data and allows users to select different annotation vocabularies, and to export, save, and load descriptions. Further development is still needed on features like the export function and usability testing.
The document summarizes a student project on speech recognition using Python. It reviews 4 papers on topics related to speech recognition, natural language processing, and machine learning approaches. It also includes a problem statement, a methodology, a comparison table of the papers, and conclusions, and it proposes future work such as integrating speech APIs and creating a mobile app. The project uses Python and Tkinter to create a GUI-based speech recognition system that converts speech to text and vice versa.
Olinda Turner is a content developer and program manager with over 20 years of experience in content strategy, development, and management. She has a proven track record of leading teams, developing content and user experiences, and managing complex projects across multiple teams and technologies. Her technical skills include content management systems, web technologies, and Microsoft Office Suite. She holds a B.A. in Mathematics and Business Administration from the University of San Diego and certificates in Content Strategy and Web Technology from Northwestern University and University of Washington.
The document discusses modelling and exchanging annotations for Europeana projects. It proposes adopting the W3C Web Annotation Data Model to represent annotations in RDF using JSON-LD serialization. An Annotations API based on the W3C Web Annotation Protocol allows exchanging annotations between Europeana and platforms like HistoryPin.org and Pundit. Representing metadata annotations is also discussed to make them machine-readable and shareable across interfaces. Overall, modelling annotations interoperably and exchanging them across platforms is still a work in progress.
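As a rough illustration of the JSON-LD serialization mentioned in this summary, the sketch below builds one minimal annotation in Python. The context URL and the property names (`body`, `target`, `TextQuoteSelector`) follow the W3C Web Annotation Data Model; the function name, the example.org identifiers, and the comment text are assumptions made up for this example and are not taken from Europeana, HistoryPin.org, or Pundit.

```python
import json

# Context URL and property names come from the W3C Web Annotation Data Model.
# All IDs, URLs, and text values below are hypothetical illustration data.
ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"


def make_annotation(anno_id, target_url, exact_text, comment):
    """Build a minimal annotation that comments on a quoted text fragment."""
    return {
        "@context": ANNO_CONTEXT,
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        # The body carries what is being said about the target.
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        # The target is the annotated resource, narrowed to a text fragment.
        "target": {
            "source": target_url,
            "selector": {
                "type": "TextQuoteSelector",
                "exact": exact_text,
            },
        },
    }


annotation = make_annotation(
    "http://example.org/anno/1",
    "http://example.org/page.html",
    "web annotation",
    "Example comment on this phrase.",
)
serialized = json.dumps(annotation, indent=2)
print(serialized)
```

Because the result is plain JSON, it can be passed between systems as text; how a given server accepts or stores it is outside the scope of this sketch.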
Language Resources for Multilingual Europe (Georg Rehm)
META-NET has received funding from the EU to support several language technology projects, including CRACKER, T4ME, CESAR, METANET4U, and META-NORD. It brings together over 60 research centers across 34 countries to build infrastructure for sharing language resources and tools. The goal is to improve the visibility, documentation, identification, availability, and interoperability of language resources in order to support both academic and commercial language technology research and development across Europe.
This document provides an introduction to a web technologies course. It defines key terms like the Internet and the World Wide Web and outlines the history and growth of the web from 1995 to the present day. It also describes the focus of the course, which is web development technologies including protocols, architectures, languages, and methods and tools. The document lists prerequisites and provides an overview of course contents, exams, and references.
Cs8092 computer graphics and multimedia unit 5 (SIMONTHOMAS S)
This document discusses multimedia authoring tools and techniques. It covers several topics:
1. Types of multimedia authoring tools including card/page based tools, icon based tools, and time based tools. Popular examples are discussed.
2. Key features and capabilities of authoring tools including editing, programming, interactivity, playback, delivery, and project organization.
3. Authoring system metaphors such as hierarchical and flow control, and the technologies they focus on, such as hypermedia.
4. Considerations for multimedia production, presentation, and automatic authoring. Professional development tools are also outlined.
An Introduction to Semantic Web Technology (Ankur Biswas)
The document provides an overview of the semantic web and some of its key challenges. It discusses:
1) The evolution of the world wide web from a web of documents to a web of linked data through technologies like RDF, OWL, and SPARQL that add semantic meaning.
2) The vision for the semantic web is to publish machine-readable data using common formats so that information can be automatically processed by agents and integrated across sources.
3) Some challenges in realizing this vision include dealing with implicit knowledge, heterogeneous data distributions, and maintaining links and correctness over time as data changes.
Explore the transformative power of full stack development and its profound implications for shaping the landscape of modern applications. Delve into the synergy of front-end and back-end prowess, and discover how this approach revolutionizes user experiences, functionality, and the very core of software design. Join us on a journey through the evolution of development, where full stack prowess emerges as a driving force behind the applications of tomorrow.
ITAC 2016 Where Open Source Meets Audit Analytics (Andrew Clark)
Open source software is taking the computer science community and IT departments by storm. The breadth of options, the timeliness of updates, the price, and the sense of community are all contributing factors to the rise of open source computing. For many years audit analytics has been confined to the Computer Assisted Auditing Techniques (CAAT) software vendors ACL, IDEA, and now Arbutus. However, these software programs require extensive training to use effectively, are not very flexible, and in most cases fail to provide the outcome auditors are expecting. Moving to an open source platform based around the Python ecosystem allows for true customization of analytics and provides a common language for interacting with your IT department. By using the same set of tools, an auditing department can move from rudimentary AP duplicate tests all the way to advanced classification and clustering machine learning tests. Although the barrier to entry for open source software is higher than for most CAATs, cross-functional collaboration makes it possible to create a truly customized, sustainable, and highly effective analytics program.
This document provides an overview of a one-day knowledge seminar on Web 3.0, semantics, and enterprise computing hosted by Canopus Consulting. The seminar will cover topics like representing domain knowledge using ontologies, schemaless databases, and architectural challenges of large-scale content repositories. It includes sessions on the evolution of the semantic web, building semantic applications, user experience design for Web 3.0, and applying semantic computing techniques in enterprises. The speakers are experts from Canopus Consulting and other organizations in areas like cultural informatics, semantic and collaborative computing.
Learn web development: Front-end vs Back-end development (puneetbatra24)
Good web development implies an optimized website, which is essential for attracting visitors from search engines. The hidden drivers of that growth are web development companies that create websites and mobile apps, improving online sales and making customers' lives simpler. To learn more about web development, visit https://www.up2mark.com/web-development
No more BITS - Blind Insignificant Technologies and Systems by Roger Roberts... (ACTUONDA)
No more BITS - Blind Insignificant Technologies and Systems by Roger Roberts of RTBF TITAN
First BIG MEDIA meeting
Connecting Media, Audience and Advertising with Data
June 24, 2014, Madrid
• Platinum sponsor: Perfect Memory
• Gold sponsors: Stratio, Paradigma
• With the support of: Big Data Spain, Medios On
• Technology partner: Agora News
• Organizers: Actuonda and Cátedra Big Data UAM-IBM
• Contact: Nicolas Moulard (Actuonda) moulard@actuonda.com @Radio_20
www.bigmediaconnect.es
LocServ - presentation of great localization and internationalization services (LocServ)
This document provides an overview of consulting services for internationalization, localization, and localization management. It describes assessments to analyze technical requirements, costs, and localization readiness. It also outlines services for internationalization development and testing, software and website localization and translation, and localization testing. The goal is to help clients expand their product or service into global markets.
Vladimir Alexiev presented ResearchSpace, a virtual research environment (VRE) based on the CIDOC CRM ontology. ResearchSpace aims to provide tools and services to support collaborative research projects for cultural heritage scholars. It aggregates data from various sources using semantic technologies and the CIDOC CRM ontology, allows semantic search of the data based on fundamental relations, and includes features for data analysis, collaboration, and web publication. The presentation provided an overview of Ontotext, the company developing ResearchSpace, described some of ResearchSpace's key capabilities, and discussed how the CIDOC CRM is central to ResearchSpace's approach.
QURATOR: A Flexible AI Platform for the Adaptive Analysis and Creative Genera... (Georg Rehm)
Georg Rehm. QURATOR: Developing a Flexible AI Platform for Digital Content Curation. QURATOR 2020 – Conference on Digital Curation Technologies, January 2020. Fraunhofer FOKUS, January 20/21, 2020. Invited keynote talk.
Observations on Annotations – From Computational Linguistics and the World Wi... (Georg Rehm)
Georg Rehm. Observations on Annotations – From Computational Linguistics and the World Wide Web to Artificial Intelligence and back again. Annotation in Scholarly Editions and Research: Function – Differentiation – Systematization, University of Wuppertal, Germany. February 20-22, 2019. Invited keynote talk.
The Preparation, Impact and Future of the META-NET White Paper Series “Europe... (Georg Rehm)
Georg Rehm. The Preparation, Impact and Future of the META-NET White Paper Series “Europe’s Languages in the Digital Age”. Sanskrit and Other Indian Languages Technology (SOIL-Tech), Jawaharlal Nehru University, New Delhi, India, February 2019. February 15, 2019. Invited keynote talk.
AI and Conference Interpretation – From Smart Assistants for the Human Interp... (Georg Rehm)
Georg Rehm. AI and Conference Interpretation - From Smart Assistants for the Human Interpreter to Automatic Solutions. DG Interpretation Lunchtime Session on Digital Transformation. European Commission, Brussels, November 2018. November 12, 2018. Invited talk.
Künstliche Intelligenz beim Dolmetschen und Übersetzen (Georg Rehm)
Georg Rehm. Künstliche Intelligenz beim Dolmetschen und Übersetzen. Institut für Angewandte Linguistik und Translatologie, Universität Leipzig, November 2018. November 1, 2018. Invited presentation.
Herausforderungen und Lösungen für die europäische Sprachtechnologie- Forschu... (Georg Rehm)
Georg Rehm. Herausforderungen und Lösungen für die europäische Sprachtechnologie-Forschung und -Entwicklung. Deutsches Forschungszentrum für Künstliche Intelligenz GmbH, Berlin, Germany, October 2018. October 30, 2018. Presentation on the occasion of being awarded the appointment as a DFKI Research Fellow.
European Language Technologies – Past, Present and Future (Georg Rehm)
Georg Rehm. European Language Technologies – Past, Present and Future. Language Equality in the Digital Age. Conference on language technologies and digital equality in a multilingual Europe, European Parliament, Brussels, Belgium, September 2018. September 27, 2018. Invited talk.
Towards a Human Language Project for Multilingual Europe: AI and Interpretation (Georg Rehm)
Georg Rehm. Towards a Human Language Project for Multilingual Europe: AI and Interpretation. DG Interpretation Conference - Interpretation: Sharing Knowledge & Fostering Communities. European Commission, Brussels, April 2018. April 19/20, 2018. Invited talk.
KI, Sprachtechnologie und Digital Humanities: Ein (unvollständiger) ÜberblickGeorg Rehm
Georg Rehm. KI, Sprachtechnologie und Digital Humanities: Ein (unvollständiger) Überblick. Interdisziplinärer Forschungsverbund Digital Humanities in Berlin (ifDHb), 23. Berliner DH-Rundgang im Deutschen Forschungszentrum für Künstliche Intelligenz, Berlin, Germany, February 05, 2018.
Language Technologies for Multilingual Europe - Towards a Human Language Proj...Georg Rehm
META-NET has received funding from the EU for several projects related to language technologies, most recently the CRACKER project. The document outlines the history and development of META-NET's Strategic Research and Innovation Agenda (SRIA), including versions 0.5, 0.9, and the current version 1.0 beta, which endorses the establishment of a Human Language Project to help overcome language barriers in Europe. A recent survey of over 600 language technology experts found strong support for a large-scale Human Language Project to achieve deep natural language understanding by 2030.
AI for Translation Technologies and Multilingual EuropeGeorg Rehm
Georg Rehm. AI for Translation Technologies and Multilingual Europe. DG TRAD Conference - Translation Services in the Digital World: A Sneak Peek into the (near) Future. Luxembourg. October 16/17, 2017.
Georg Rehm. Kuratieren im Zeitalter der KI. #DKT17 - Kuratieren im Zeitalter der KI, Berlin, Germany, October 2017. October 12, 2017. Invited keynote talk.
Transformieren, Manipulieren, Kuratieren: Technologien für die Wissensarbeit ...Georg Rehm
Georg Rehm. Transformieren, Manipulieren, Kuratieren? Technologien für die Wissensarbeit im Netz. KOOP-LITERA International. Konferenz 2017, Berlin, Germany, June 2017. June 20, 2017. Invited talk.
Digitale Kuratierungstechnologien: Anwendungsfälle in Digitalen BibliothekenGeorg Rehm
Georg Rehm and Clemens Neudecker. Digitale Kuratierungstechnologien: Anwendungsfälle in Digitalen Bibliotheken . Berliner Bibliothekswissenschaftliches Kolloqium (BBK), Humboldt-Universität zu Berlin, Berlin, Germany, June 2017. June 06, 2017. Invited talk.
Georg Rehm. EPUB, quo vadis? ePublishing im W3C. Jahrestagung der IG Digital. Im Rahmen der Buchtage, Jahreskongress des Börsenvereins, Berlin, Germany, June 2017. June 14, 2017. Invited talk.
Human Language Technologies in a Multilingual EuropeGeorg Rehm
The document summarizes a presentation on human language technologies in a multilingual Europe. Some key points:
- There are 24 official EU languages and many regional/minority languages that have equal status but most are under-supported by language technologies and face digital extinction.
- The META-NET alliance coordinates language technology research across Europe but the field remains fragmented. There is a need for high-quality, deployable language technologies to support applications like translation, conversational interfaces, and a multilingual digital single market.
- A proposed "Multilingual Value Programme" would help enable the multilingual digital single market through technologies for translating, analyzing, processing and curating natural language content.
- A long-term
Language Technologies for Big Data – A Strategic Agenda for the Multilingual ...Georg Rehm
Georg Rehm. Language Technologies for Big Data – A Strategic Agenda for the Multilingual Digital Single Market. BDVA Summit (Big Data Value Association), Valencia, Spain, December 2016. December 1, 2016.
Multilingual Europe in late 2016 – A Strategic Research and Innovation Agenda...Georg Rehm
Georg Rehm. Multilingual Europe in late 2016 – A Strategic Research and Innovation Agenda for the Multilingual Digital Single Market. Future and Emerging Trends in Language Technologies, Machine Learning and Big Data (FETLT 2016), Seville, Spain, November 2016. November 30, 2016.
Web Annotations – A Game Changer for Language Technology?
1. Georg Rehm, Felix Sasaki, Aljoscha Burchardt
DFKI GmbH – Language Technology Lab, Berlin
Web Annotations
A Game Changer for Language Technologies?
2. Language Technology
• Language Technology is a heterogeneous and evolving set of applications that involve the
– (semi-)automatic processing (analysis) or
– (semi-)automatic production
of human language (written or spoken).
• Driven by NLP, CL, Linguistics, CompSci, CogSci, AI.
• Methods operate on language data (often web-scale).
• Rule-based tools, statistics (machine learning).
• Need for human experts to analyse and annotate data sets with highly specialised linguistic analysis information.
Web Annotations and Language Technology – I Annotate 2016 2
3. Selected LT Applications
Spell checking, grammar checking
Search engines (IR)
Interactive personal assistants (Cortana, Siri etc.)
Machine Translation
Recommender systems
Social media (analytics, streams)
Knowledge-based systems
4. Web Annotation Architecture
Web annotation architecture (source: http://www.w3.org/annotation)
What is the relationship between Web Annotations and Language Technology?
5. Web Annotation Architecture
Content could be created by Language Technology fully automatically or in a semi-automatic way (text generation).
6. Web Annotation Architecture
Content could be analysed by Language Technology (semantic analysis, input for ML algorithms etc.).
7. Web Annotation Architecture
Especially in Social Media Analytics (SMA) we are very interested in user-generated content (UGC), i.e., in comments and feedback: “What do users think of a certain product?” etc.
8. Web Annotation Architecture
• Today, analysing UGC is difficult and costly (many heterogeneous sources, many different formats).
• A few established and widely used Web Annotation services would simplify SMA dramatically!
9. Web Annotation Architecture
We can also use Language Technology methods to create (or help create) annotations, for example, in a smart authoring scenario.
10. LT and Web Annotations
• Analysis of web annotations and making use of web annotations through Language Technology:
– Arbitrary web annotations (i.e., unstructured text)
• No more crawling, aggregating, mapping!
– Dedicated LT-specific web annotations
• Annotating language data without any specialised stand-alone tools or data repositories!
• Generation of web annotations through Language Technology (e.g., to provide background information on important content – see, e.g., the Pundit use cases).
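To make the "no more crawling, aggregating, mapping" point concrete, here is a minimal sketch of reading user comments once they are available as Web Annotations. It assumes annotations shaped like the JSON-LD of the W3C Web Annotation Data Model; the function name `textual_bodies` and the example.org IRIs are hypothetical, and targets given as lists are not handled.

```python
from typing import Iterable


def textual_bodies(annotations: Iterable[dict]) -> list[tuple[str, str]]:
    """Return (target IRI, comment text) pairs for every textual body."""
    pairs = []
    for anno in annotations:
        target = anno.get("target")
        # A target is either a plain IRI or an object carrying the IRI in "source"/"id".
        if isinstance(target, dict):
            target = target.get("source") or target.get("id")
        bodies = anno.get("body", [])
        # A single body may be given as an object or an IRI string instead of a list.
        if isinstance(bodies, (dict, str)):
            bodies = [bodies]
        for body in bodies:
            if isinstance(body, dict) and body.get("type") == "TextualBody":
                pairs.append((target, body["value"]))
    return pairs


# Two hypothetical annotations, as two different services might publish them.
comments = [
    {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": "http://example.org/anno/1",
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": "Battery life is great."},
        "target": "http://example.org/product/42",
    },
    {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": "http://example.org/anno/2",
        "type": "Annotation",
        "body": [{"type": "TextualBody", "value": "Too expensive."}],
        "target": {"source": "http://example.org/product/42"},
    },
]

for target, text in textual_bodies(comments):
    print(target, "->", text)
```

Because both services use the same model, one small reader covers both; the extracted texts could then go straight into, say, a sentiment analysis component.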
11. Example Scenarios
• Two example scenarios to demonstrate how Language Technology and Web Annotations go together.
• Scenario 1 – Digital Curation Technologies: Semantification of content for curators of digital information
• Scenario 2 – Machine Translation: Web Annotations for High-Quality Machine Translation
12. Digital Curation Technologies
• Support curation processes through sophisticated language and knowledge technologies.
• Goal: transfer of these technologies into industry through platform for digital curation technologies.
Diagram labels: language and knowledge technologies; curation technologies; sector-specific technologies; platform technologies; sector-specific solutions.
14. Sectors
• Input: tweet, newspaper article, wire copy, Facebook status update, search result, email, text message, concept, text file, video, map, stock photo, in-house database, calendar entry, spreadsheet, archive, etc.
• Processes: analyse, select, focus, revise, read up on, write, create, research, assess, evaluate, arrange, sort, structure, summarise, shorten, translate, catch up on, combine, abstract, integrate, visualise, generate, annotate, reference, etc.
• Software: text processor, presentation, spreadsheet, email, browser, groupware, sector-specific application, CMS, ECMS, CRM, enterprise software, graphics/layouting software, IP telephony, etc.
• Output: newspaper article, multimedia website, TV report, exhibition catalogue, mobile application, mashup (e.g., map), text piece, concept, timeline, study, presentation, fact collection, description of an exhibit, analysis, etc.
15. Curation Processes
• Structure visualisation
• Multilingual multimedia sources
• Crossmedia recommendations
• Multilingual summarisation
• Event timelining
• Semantification of content
• Multilingual sentiment analysis
• Semantic story-telling
• Ontology-based knowledge structures
16. Platform for digital curation technologies
Diagram labels: client using the API (four clients); broker REST API; curation service 1 and curation service 2 (each a language or knowledge technology); external service 1 and external service 2; pipelined curation workflow.
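The idea of a broker that chains curation services into a pipelined workflow can be sketched in a few lines. This is only an illustration under stated assumptions: the `Broker` class, its methods and the two toy services are hypothetical, and plain in-process functions stand in for what the slide describes as services behind a REST API.

```python
from typing import Callable

# A service takes a document (a dict) and returns an enriched copy of it.
Service = Callable[[dict], dict]


class Broker:
    """Hypothetical in-process stand-in for the broker on the slide."""

    def __init__(self) -> None:
        self._services: dict[str, Service] = {}

    def register(self, name: str, service: Service) -> None:
        self._services[name] = service

    def run_pipeline(self, names: list[str], document: dict) -> dict:
        # Each service receives the output of the previous one.
        for name in names:
            document = self._services[name](document)
        return document


def tokenise(doc: dict) -> dict:
    return {**doc, "tokens": doc["text"].split()}


def count_tokens(doc: dict) -> dict:
    return {**doc, "token_count": len(doc["tokens"])}


broker = Broker()
broker.register("tokenise", tokenise)
broker.register("count", count_tokens)

result = broker.run_pipeline(
    ["tokenise", "count"], {"text": "Web annotations meet curation"}
)
print(result["token_count"])
```

In a real deployment each registered service would be a remote call, and the document passed along could carry Web Annotations instead of ad hoc dictionary keys.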
17. Platform for digital curation technologies – Example
• Annotation of time expressions – needed for timeline visualisation
• Input: text content – output: list of time expressions and mean dates
• Storage using the Web Annotation model
• http://dkt-projekt.github.io/webAnnotation/webannotation-dkt.html
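As a rough illustration of the example above, the following sketch finds time expressions in a text and stores each one as a Web Annotation with a `TextPositionSelector`. It is not the project's implementation: it only recognises ISO dates (YYYY-MM-DD), the function names and the example.org IRI are hypothetical, and reading "mean date" as the average of the detected dates is an assumption.

```python
import re
from datetime import date

# Simplification: only ISO dates are treated as time expressions.
ISO_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def annotate_dates(text: str, source: str) -> list[dict]:
    """Build one Web Annotation per valid ISO date found in the text."""
    annotations = []
    for match in ISO_DATE.finditer(text):
        try:
            date.fromisoformat(match.group(0))
        except ValueError:
            continue  # looks like a date but is not a real calendar day
        annotations.append({
            "@context": "http://www.w3.org/ns/anno.jsonld",
            "type": "Annotation",
            "motivation": "tagging",
            "body": {"type": "TextualBody", "value": match.group(0)},
            "target": {
                "source": source,
                "selector": {
                    "type": "TextPositionSelector",
                    "start": match.start(),  # inclusive
                    "end": match.end(),      # exclusive
                },
            },
        })
    return annotations


def mean_date(annotations: list[dict]) -> date:
    """Average the annotated dates (assumed meaning of 'mean date')."""
    ordinals = [date.fromisoformat(a["body"]["value"]).toordinal()
                for a in annotations]
    return date.fromordinal(sum(ordinals) // len(ordinals))


text = "The workshop ran from 2016-05-18 to 2016-05-20."
annos = annotate_dates(text, "http://example.org/doc/1")
print(len(annos), mean_date(annos))
```

The selector offsets point back into the original text, so a timeline view can both place the document in time and highlight the expression it relied on.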
20. Web Annotations for HQMT
• Current MT research workflows use several specialised and incompatible tools and distributed repositories.
• Ideal scenario: one coherent, interoperable and integrated ecosystem of tools.
• Centrally stored web annotations would be a massive step in the right direction!
http://www.cracking-the-language-barrier.eu/mt-eval-workshop-2016/
Diagram labels:
• Human Evaluation: Ranking, Post-Editing, Error Annotation (MQM), Task-based Evaluation
• Translation Production Workflows: Sampling, Filtering, Translation Memory Inclusion, Terminology Checking
• Linguistic Analysis: Tokenisation, POS tagging, Parsing, Entity recognition, WSD
• Machine Translation: Services, Development
• Automatic Evaluation: BLEU, Quality Estimation, PE-Distance, Test-Suites
• Other labels: REPOSITORY, COCKPIT, BACKEND, DATA SETS, META-SHARE, WMT, JRC, CLARIN
21. Multidimensional Quality Metrics
MQM for MT diagnostics
• Customisable framework for translation quality metrics
• Early version standardised in W3C’s ITS 2.0
• Annotations in current workflows are typically proprietary, tool-, format- and workflow-based.
• Web annotations could enable the creation of a collaborative corpus of translation data for the whole community.
• Feedback into MT engines through annotated web-scale corpora could lead to a boost in performance and quality.
• Next slide: conversion of proprietary tool format to Web Annotations.
22. From MQM to Web Annotations
Figure labels: Proprietary and tool-specific CSV; MQM issue type; Web Annotation (intermediate XML syntax)
https://github.com/dkt-projekt/webAnnotation/tree/gh-pages/mqm-webannotation
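The conversion from a tool-specific CSV to Web Annotations could look roughly like the sketch below. It makes several assumptions: the CSV column layout (`target`, `start`, `end`, `issue_type`) and the example.org IRIs are invented for illustration, and the output is the JSON-LD serialisation of the W3C data model, not the intermediate XML syntax named on the slide.

```python
import csv
import io
import json

# Hypothetical export of an error annotation tool; the real column layout may differ.
SAMPLE_CSV = """target,start,end,issue_type
http://example.org/mt/seg1,4,9,Mistranslation
http://example.org/mt/seg2,0,3,Omission
"""


def mqm_rows_to_annotations(csv_text: str) -> list[dict]:
    """Turn each CSV row into a Web Annotation carrying the MQM issue type."""
    annotations = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        annotations.append({
            "@context": "http://www.w3.org/ns/anno.jsonld",
            "type": "Annotation",
            "motivation": "classifying",
            # The MQM issue type becomes the annotation body.
            "body": {"type": "TextualBody", "value": row["issue_type"]},
            # The annotated span of the translated segment becomes the target.
            "target": {
                "source": row["target"],
                "selector": {
                    "type": "TextPositionSelector",
                    "start": int(row["start"]),
                    "end": int(row["end"]),
                },
            },
        })
    return annotations


annotations = mqm_rows_to_annotations(SAMPLE_CSV)
print(json.dumps(annotations[0], indent=2))
```

Once the issue types live in a shared model like this, annotations from different evaluation tools can be pooled into one corpus without per-tool parsers.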
23. Web Annotation Infrastructure
• Web annotations themselves work on language.
• Language Technology could help build better services.
• Anchoring annotations to changing content in a robust way is apparently tricky.
• Semantic methods for identifying the new position of the original anchors that have changed since the annotation was put there.
• Annotating all copies of the document that is currently being annotated – application of methods for duplicate detection or near duplicate detection.
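Both infrastructure problems can be illustrated with very simple baselines. The sketch below re-anchors a quoted passage in a changed text and scores two texts for near-duplication. Note the substitution: it uses plain string similarity (Python's `difflib`) and word-shingle Jaccard overlap, not the semantic methods the slide has in mind, and the function names and the 0.8 threshold are arbitrary choices for illustration.

```python
from difflib import SequenceMatcher
from typing import Optional


def reanchor(quote: str, new_text: str,
             threshold: float = 0.8) -> Optional[tuple[int, int]]:
    """Return (start, end) of the window in new_text most similar to quote."""
    width = len(quote)
    best_ratio, best_start = 0.0, -1
    # Slide a window of the quote's length over the new text.
    for start in range(max(1, len(new_text) - width + 1)):
        window = new_text[start:start + width]
        ratio = SequenceMatcher(None, quote, window).ratio()
        if ratio > best_ratio:
            best_ratio, best_start = ratio, start
    if best_ratio < threshold:
        return None  # the anchor text seems to be gone
    return best_start, best_start + width


def shingles(text: str, k: int = 3) -> set:
    """Set of overlapping k-word sequences, lower-cased."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Shingle overlap between two texts: 1.0 identical, 0.0 disjoint."""
    sa, sb = shingles(a, k), shingles(b, k)
    return len(sa & sb) / len(sa | sb)


old_quote = "a massive step in the right direction"
changed = "It would be a massive step in the right directon."
print(reanchor(old_quote, changed))
print(jaccard("web annotations work on language",
              "web annotations work on text"))
```

The sliding window is quadratic and only tolerates small edits; it is meant to show where more robust, meaning-aware matching would plug in.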
24. Vision 2020
• Next generation personal assistant.
• Highly personalised, assisted browsing experience.
• Semantic language technologies in the background.
• Detection of the user’s tasks, intentions, preferences.
• Annotation of relevant, surprising, new facts in current and future content through web annotations.
• Anticipation of the user’s next steps.
• Suggestion of related content based on user modelling and semantic story telling.
Georg Rehm and Hans Uszkoreit (eds.). The META-NET Strategic Research Agenda for Multilingual Europe 2020. Springer, 2013; see Priority Research Theme “Socially-Aware Interactive Assistant”.
25. So, are Web Annotations a game changer for Language Technology?
Yes, most certainly – if the UX and browser support are done right.
Maybe Language Technology can also be a game changer for Web Annotations.
26. Thank you!
Beyond Multilingual Europe
04/05 July, 2016 – Lisbon, Portugal
http://www.meta-forum.eu
Deadline for submissions: 29 May 2016