CEPT Systems is developing a new natural language processing technology called Semantic Fingerprinting that can dramatically improve how businesses process large amounts of text-based data. Their technology, called the CEPT Retina, converts words and documents into semantic fingerprints that capture relationships between meanings. These fingerprints allow for direct comparison of word and document similarities. CEPT offers their technology as a cloud-based API that is simple for developers to use and integrate into various applications. Their technology has 12 application-specific APIs and is aimed to help businesses with tasks like search, classification, discovery, and analytics using semantic analysis of text. An example success story is an online language learning company that is using the CEPT API to lower costs, improve learner motivation, generate
With IPscreener anyone is able to explore and understand the tech knowledge hidden in patents. By only using a plain text input the semantic AI presents a dashboard of the innovation landscape, identifying similar documents and pointing out relevant paragraphs. Use IPscreener for a smarter way to validating your ideas.
AI-SDV 2020: Bringing AI to SME projects: Addressing customer needs with a fl...Dr. Haxel Consult
Customers interested in Language Analytics solutions typically approach us with a broad range of business cases and specific business needs. Especially when it comes to the data available for their case and for any AI aspects involved, the variation in data types, data quality and data quantity is, by our experience, quite vast and at the same time so critical for a project's success, that we often start our requirements analysis right there: at the data. At Karakun, our Language Analytics team addresses this in an increasingly flexible way: We select from a set of Language Analytics tools and related services (e.g. data cleansing and data procurement) to meet the business needs at hand with the data available or at least in reach – at reasonable costs.
The methodology stack ranges from heuristic logic over statistical solutions to neural networks. At the same time, we aim at reducing the amount of data needed for such training, e.g. by integrating state-of-the-art neural technologies into our platform. That way, also SMEs and their specific business cases can benefit from the full range of Language Analytics options.
To illustrate our approach, we will present an e-Safe solution which allows for semantic document tagging and search in highly secured virtual safes. In addition, our solution provides text-based triggers for complex workflows depending on the safe´s content.
The Meeting
Programme
Networking Programme
Speakers 2018
Call for Papers
Registration
Attendees 2018
Why should you attend
The Venue Hotel
Sponsors
Exhibition
Exhibitors
Search Technology VantagePoint
II-SDV Photos Nice 2017
II-SDV 2017
II-SDV 2016
II-SDV 2015
II-SDV 2014
II-SDV 2013
II-SDV 2012
SPEAKERS SINCE 2012
Keep me informed
concerning the II-SDV meeting
ICIC Website
II-PIC Website
Search Technology VantagePoint
For over 30 years, Search Technology, Inc. has helped our customers turn information into knowledge. We provide software tools and services that extract more value from patent, scientific, technical and business databases. Our primary product, VantagePoint, helps you rapidly understand and navigate through search results, giving you a better perspective - a better vantage point - on your information. Discover why many of today’s Fortune 100 companies use VantagePoint to help them succeed. VantagePoint is Serious Software for Serious Professionals.
IC-SDV 2019: Down-to-earth machine learning: What you always wanted your data...Dr. Haxel Consult
Applications of machine learning on NLP tasks today receive a lot of attention and have been shown to yield state of the art results on a wide range of tasks. We describe several cases where machine learning is deployed productively under the usual constaints of real-world projects: Real-world requirements, fast throughput, reasonably low requirements in terms of training corpus size and high quality results. What we observe is a general trend towards open source - also our components are open source. With the software being mostly freely available, among the key success criteria for many NLP projects today therefore is first and foremost the necessary expertise required to combine, tune and apply open source components.
ICIC 2013 Conference Proceedings Tony Trippe PatinformaticsDr. Haxel Consult
Using IBM's Many Eyes for Generating Valuable Patent Analytics Insights
Tony Trippe (Patinformatics, USA)
PAIR (Patent Application Information Retrieval) is the US Patent and Trademark Office's Registry system for sharing information on the prosecution of patent applications. The information contained in the system can be used to research patent documents to determine their value. In this presentation a tab by tab walk-through of the site will be provided. Specific sections and pieces of data provided on the site will be highlighted and their role in determining the potential value of a patent document will be explained. In addition a means for collecting, organizing and presenting a summary of Public PAIR data, provided by Google, will be discussed.
Minesoft, founded in 1996, develops patent information services and solutions, linking technical, legal, bibliographic and full text patent information for patent professionals and research specialists.
Minesoft's products and services are used by leading corporations, national Patent Offices, patent attorneys and law firms globally.
Minesoft develops fast, effective document delivery solutions, a range of competitive intelligence services including patent alerting and monitoring tools, a patent family & legal status portal delivering the latest data for over 40 countries, a searchable patent database of over 48 million records, patent archiving, advanced viewing tools and more.
In May 2003, Minesoft and RWS announced their agreement to develop PatBase, a new searchable patent database designed by experts in the complex art of search and retrieval of patent information. The first public demonstrations of PatBase in November 2003 received critical acclaim from expert patent searchers for the intuitive and feature-rich interface. PatBase is a comprehensive new search resource for patent and R&D specialists worldwide, covering 100 countries.
Organised into extended patent families, each invention is represented by one individual record, making it easy to see in which countries a filing for patent protection has been made. PatBase is an ideal resource for conducting daily patent & technology searches. Rapid delivery of competitive intelligence, integrated searchable full text, PDF delivery and legal status reports bring PatBase to the forefront of today's searchable patent information tools. PatBase is regularly updated with new data and new search features and is now established as a leading patent resource.
PatBase Express, developed in partnership with RWS and released March 2006, provides an easy-to-use end-user corporate-wide solution. PatBase Express allows the most up-to-date scientific research and competitor information contained in patent documents to be accessed throughout an organisation.
Minesoft products can be customised to individual corporate requirements.
Find more on www.minesoft.com
AI-SDV 2020: Implementation of new technology within a big pharma company: Fi...Dr. Haxel Consult
Pharmaceutical companies have always relied on data to support innovation and drive the business. This concept has remained unchanged, despite the fact that today we get it at the speed of light, in an overwhelming volume, and in a global and mostly unstructured way. In order to continue to derive knowledge and insights from that data to support drug discovery and business strategies, integrating AI tools into work processes and acquiring the required skills to do so has become crucial. Although the need is clear, how to implement these new tools is not straightforward in an era of restructuring, divesting, outsourcing, and budget crunching. This talk will focus on creative ways to overcome some of these barriers within a “big pharma” setting with a specific example demonstrating the application of these concepts.
With IPscreener anyone is able to explore and understand the tech knowledge hidden in patents. By only using a plain text input the semantic AI presents a dashboard of the innovation landscape, identifying similar documents and pointing out relevant paragraphs. Use IPscreener for a smarter way to validating your ideas.
AI-SDV 2020: Bringing AI to SME projects: Addressing customer needs with a fl...Dr. Haxel Consult
Customers interested in Language Analytics solutions typically approach us with a broad range of business cases and specific business needs. Especially when it comes to the data available for their case and for any AI aspects involved, the variation in data types, data quality and data quantity is, by our experience, quite vast and at the same time so critical for a project's success, that we often start our requirements analysis right there: at the data. At Karakun, our Language Analytics team addresses this in an increasingly flexible way: We select from a set of Language Analytics tools and related services (e.g. data cleansing and data procurement) to meet the business needs at hand with the data available or at least in reach – at reasonable costs.
The methodology stack ranges from heuristic logic over statistical solutions to neural networks. At the same time, we aim at reducing the amount of data needed for such training, e.g. by integrating state-of-the-art neural technologies into our platform. That way, also SMEs and their specific business cases can benefit from the full range of Language Analytics options.
To illustrate our approach, we will present an e-Safe solution which allows for semantic document tagging and search in highly secured virtual safes. In addition, our solution provides text-based triggers for complex workflows depending on the safe´s content.
The Meeting
Programme
Networking Programme
Speakers 2018
Call for Papers
Registration
Attendees 2018
Why should you attend
The Venue Hotel
Sponsors
Exhibition
Exhibitors
Search Technology VantagePoint
II-SDV Photos Nice 2017
II-SDV 2017
II-SDV 2016
II-SDV 2015
II-SDV 2014
II-SDV 2013
II-SDV 2012
SPEAKERS SINCE 2012
Keep me informed
concerning the II-SDV meeting
ICIC Website
II-PIC Website
Search Technology VantagePoint
For over 30 years, Search Technology, Inc. has helped our customers turn information into knowledge. We provide software tools and services that extract more value from patent, scientific, technical and business databases. Our primary product, VantagePoint, helps you rapidly understand and navigate through search results, giving you a better perspective - a better vantage point - on your information. Discover why many of today’s Fortune 100 companies use VantagePoint to help them succeed. VantagePoint is Serious Software for Serious Professionals.
IC-SDV 2019: Down-to-earth machine learning: What you always wanted your data...Dr. Haxel Consult
Applications of machine learning on NLP tasks today receive a lot of attention and have been shown to yield state of the art results on a wide range of tasks. We describe several cases where machine learning is deployed productively under the usual constaints of real-world projects: Real-world requirements, fast throughput, reasonably low requirements in terms of training corpus size and high quality results. What we observe is a general trend towards open source - also our components are open source. With the software being mostly freely available, among the key success criteria for many NLP projects today therefore is first and foremost the necessary expertise required to combine, tune and apply open source components.
ICIC 2013 Conference Proceedings Tony Trippe PatinformaticsDr. Haxel Consult
Using IBM's Many Eyes for Generating Valuable Patent Analytics Insights
Tony Trippe (Patinformatics, USA)
PAIR (Patent Application Information Retrieval) is the US Patent and Trademark Office's Registry system for sharing information on the prosecution of patent applications. The information contained in the system can be used to research patent documents to determine their value. In this presentation a tab by tab walk-through of the site will be provided. Specific sections and pieces of data provided on the site will be highlighted and their role in determining the potential value of a patent document will be explained. In addition a means for collecting, organizing and presenting a summary of Public PAIR data, provided by Google, will be discussed.
Minesoft, founded in 1996, develops patent information services and solutions, linking technical, legal, bibliographic and full text patent information for patent professionals and research specialists.
Minesoft's products and services are used by leading corporations, national Patent Offices, patent attorneys and law firms globally.
Minesoft develops fast, effective document delivery solutions, a range of competitive intelligence services including patent alerting and monitoring tools, a patent family & legal status portal delivering the latest data for over 40 countries, a searchable patent database of over 48 million records, patent archiving, advanced viewing tools and more.
In May 2003, Minesoft and RWS announced their agreement to develop PatBase, a new searchable patent database designed by experts in the complex art of search and retrieval of patent information. The first public demonstrations of PatBase in November 2003 received critical acclaim from expert patent searchers for the intuitive and feature-rich interface. PatBase is a comprehensive new search resource for patent and R&D specialists worldwide, covering 100 countries.
Organised into extended patent families, each invention is represented by one individual record, making it easy to see in which countries a filing for patent protection has been made. PatBase is an ideal resource for conducting daily patent & technology searches. Rapid delivery of competitive intelligence, integrated searchable full text, PDF delivery and legal status reports bring PatBase to the forefront of today's searchable patent information tools. PatBase is regularly updated with new data and new search features and is now established as a leading patent resource.
PatBase Express, developed in partnership with RWS and released March 2006, provides an easy-to-use end-user corporate-wide solution. PatBase Express allows the most up-to-date scientific research and competitor information contained in patent documents to be accessed throughout an organisation.
Minesoft products can be customised to individual corporate requirements.
Find more on www.minesoft.com
AI-SDV 2020: Implementation of new technology within a big pharma company: Fi...Dr. Haxel Consult
Pharmaceutical companies have always relied on data to support innovation and drive the business. This concept has remained unchanged, despite the fact that today we get it at the speed of light, in an overwhelming volume, and in a global and mostly unstructured way. In order to continue to derive knowledge and insights from that data to support drug discovery and business strategies, integrating AI tools into work processes and acquiring the required skills to do so has become crucial. Although the need is clear, how to implement these new tools is not straightforward in an era of restructuring, divesting, outsourcing, and budget crunching. This talk will focus on creative ways to overcome some of these barriers within a “big pharma” setting with a specific example demonstrating the application of these concepts.
The EXTRA classifier is a scalable solution based on recent advances in Natural Language Processing (NLP). The foundational concept of the EXTRA classifier is transfer learning, a machine learning process that enables the relatively low-cost specialization of a pre-trained language model to a specific task in a specific domain with far fewer training examples compared to standard machine learning solutions.
More specifically, the EXTRA classifier leverages BERT, a well-known pre-trained autoencoding language model that has revolutionized the NLP space in the past few years. BERT provides contextual embeddings, i.e., it provides context-aware vector representations of words that capture semantics far more efficiently than their context-free counterparts.
The EXTRA classifier contains a pre-processing module to cope with the inevitable noise in the output of standard Optical Character Recognition systems. The pre-processed plain text from a source document is then fed into a BERT-based classifier, which is built by extending pre-trained BERT with an additional linear layer trained for classification through a process commonly known as fine-tuning.
We will present preliminary results that confirm some clear benefits with respect to rule-based solutions in terms of classification performance and system scalability.
AI-SDV 2020: Using Transformer technology to build an AI based personal News ...Dr. Haxel Consult
After having successfully implemented a proof of concept in late 2019, DS9 started with one of its customers to implement a pilot for a Deep Learning based personal News Rating system.
The system was initially trained to rate news matching a limited number of typical user profiles. Over time the system would evolve and learn more and more about personal preferences and build user-specific rating models.
This talk looks at the many challenges we overcame during implementation: collecting training data, integrating many diverse news sources, management of many user-specific models and automatic deployment to production.
AI-SDV 2021 - Tony Trippe - The Current State of Machine Learning for Patent ...Dr. Haxel Consult
The use of machine learning in IP activities has increased exponentially over the past five years. At the same time new tools, methods and systems have begun to emerge that seek to make the analysis of patent data easier to accomplish using these techniques. Included in these new developments are a significant number of machine learning systems that have begun coming to market. As these changes continue to occur, it would be useful to review some of the tools, systems, or methods that a patent practitioner has at their disposal. Examples and perspectives on the latest advances in machine learning for IP will be provided. There will also be a tour of ML4Patents.com which is devoted to aggregating content associated with the development of this area.
A brief overview of Watson Cognitive solutions and an introduction into how they can be incorporated into the oil & gas arena. Dave Haake, Global C&P Solutions Executive, IBM Watson
Founded in 2004 by a group of business & technology entrepreneurs and inventors, Dolcera is a next generation patent and information analytics company. Our proprietary patent search strategy and domain expertise combined with our game changing AI/ML driven tools like PCS and E-Search enable us to offer top notch patent search and analytics services to our diverse clientele in key decision-making areas of IP strategy & creation, litigation, portfolio analysis, competitive intelligence, product development and licensing.
Our AI- driven tools and services include:
Dolcera PCS – A deep-learning driven superfast patent search engine that offers domain specific Semantic and taxonomy-oriented search capabilities for Patent portfolio analysis, Prior Art, Invalidation and licensing searches.
Dolcera Machine learning for Auto-categorization – The transparent, customizable system solves the problem of human bias or comprehension of a patent when classifying them by leveraging proprietary Dolcera AI/ML. The system gives complete control to the user to deliver highly accurate categorized documents with a quick turn around.
Dolcera Enterprise Search (E-Search) – Dolcera Enterprise Search is a machine learning enabled enterprise-wide search engine integrated with various value added features and smart charts. The application uses patent documents, scientific literature, product information or generally other technical documents inside a company. The system lets users have access to all the related files with minimal clicks to generate actionable insights.
Dolcera ETSI Dashboard – A collection of SEP from various technologies and various specifications updated regularly + PCS (AI driven patent search tool) links the SEPs declared under the specific technical standards & visualize the information for.
Our bespoke services include IP research, technical review, business research and newsletter alerts which help our clients make key strategic decisions in relation to:
Choosing the right technology
Filing & protecting the right IP
Standard mapping and claim charting
Product enhancement
IP, technical & business landscapes
Competitive intelligence
Regulatory landscape
The amalgamation of Dolcera’s AI driven tools, IP services & market research capabilities have time and again assisted our clients in managing their IP & non-IP assets systematically, while enhancing their overall decision-making process.
For further information visit: www.dolcera.com
Our Global presence:
USA: California, Chicago
Europe: Germany, UK
Asia: India, China
AI-SDV 2021: Francisco Webber - Efficiency is the New PrecisionDr. Haxel Consult
The global data sphere, consisting of machine data and human data, is growing exponentially reaching the order of zettabytes. In comparison, the processing power of computers has been stagnating for many years. Artificial Intelligence – a newer variant of Machine Learning – bypasses the need to understand a system when modelling it; however, this convenience comes with extremely high energy consumption.
The complexity of language makes statistical Natural Language Understanding (NLU) models particularly energy hungry. Since most of the zettabyte data sphere consists of human data, such as texts or social networks, we face four major obstacles:
1. Findability of Information – when truth is hard to find, fake news rule
2. Von Neumann Gap – when processors cannot process faster, then we need more of them (energy)
3. Stuck in the Average – when statistical models generate a bias toward the majority, innovation has a hard time
4. Privacy – if user profiles are created “passively” on the server side instead of “actively” on the client side, we lose control
The current approach to overcoming these limitations is to use larger and larger data sets on more and more processing nodes for training. AI algorithms should be optimized for efficiency rather than precision. In this case, statistical modelling should be disqualified as a brute force approach for language applications. When replacing statistical modelling and arithmetic, set theory and geometry seem to be a much better choice as it allows the direct processing of words instead of their occurrence counts, which is exactly what the human brain does with language – using only 7 Watts!
ICIC 2017: Technology Scouting: Decision Support in Strategic Analyses for Te...Dr. Haxel Consult
Stefan Geißler (Expert System Deutschland, Germany)
Tim Schloen (Fraunhofer IAO Stuttgart, Germany)
Trying to keep up to date with developments, trends and opportunities even in narrow domains involves digesting and considering large amounts of textual information and often is beyond the reading capabilities of human experts. Large organisations may be able to cope with this challenge with dedicated analysis teams, but SMEs are overcharge with this information load.
Under the label „Technology Scouting“ we present an approach to employ modern semantic technologies and artificial intelligence to (semi) automate the collection, analysis and reporting of large document collections in a way that allows to derive important insights for technology-driven companies: Which technologies and markets are emerging or moving, how do my clients, partners and competitors operate?
After an introduction to the technological and methodological basis, we present experiences from past industry engagements where this aproach has be applied in production.
AILANI is a novel and unique semantic search enterprise solution for fast, easy and comprehensive knowledge discovery. It combines semantic modelling, ontologies, linguistics and artificial intelligence (AI) algorithms in a self-refining system that delivers results based on inter-related meaning of facts. AILANI not just allows for phrase searches as well as structured queries, it offers its users a unique hybrid natural language question answering system combining machine learning algorithms with semantic network-based "prior knowledge" inference. It integrates seamlessly with existing infrastructure and helps leverage knowledge buried both in decade old data as well as data derived from news feeds and clinical trials providing real-time semantic analysis of breaking news. For the pharmaceutical industry it is critical to stay up to date with the latest clinical trials news for decision-making in drug development. Integration of the relevant data and using ontology-based refiners enables fast and efficient retrieval of information about the clinical competitive landscape.
AI-SDV 2021: Linus Wretblad - Best practice on new intelligent tools in IP ma...Dr. Haxel Consult
New tools are indeed present promising opportunities. However, the use of AI also raises new concerns regarding the transparency of the expected usability and reliability. This presentation will elaborate around some the associated questions; the impact of the training data, the black box symptoms, verification of performance and privacy considerations when using AI. But of course also some insights to best practice of AI search tool and how to take control of the process for better efficiency.
Overview of end-to-end lifecycle to productize and commercialize alternative datasets at S&P Global Market Intelligence
Benefits to discuss:
How S&P Market Intelligence develops new alternative datasets
How S&P Market Intelligence develops robust production processes for alternative data
S&P Global Market Intelligence GTM strategy and capabilities to sell alternative data
Biomax Informatics provides services and software solutions for efficient decision making and knowledge management at the intersection of life sciences, healthcare and information technologies. Biomax facilitates digital transformation within biotech, pharma, agriculture, food and chemical industries as well as research institutes.
Biomax offers a range of standard products, based on the core technology knowledge management technology BioXM™, which are synergistically interrelated.
AILANI™, the Artificial Intelligence LANguage Interface, provides unique semantic search capabilities that catalyze digital change and accelerate the innovation cycle.
NeuroXM™ is the one-stop-shop to decipher brain physiology.
The Clinical Integration System ensures access to real world evidence data, which is critical to effectively and robustly train Artificial Intelligence to support clinical decision support at the point of care.
With more than 20 years of experience and around 50 employees - including numerous life scientists, data scientists and software developers with a scientific background - Biomax is a competent partner.
Founded in 1997, Biomax is ISO 9001 and ISO 27001 certified and is headquartered in Planegg near Munich, Germany.
More info @ www.biomax.com
For over 30 years, Search Technology, Inc. has helped our customers turn information into knowledge. We provide software tools and services that extract more value from patent, scientific, technical and business databases. Our primary product, VantagePoint, helps you rapidly understand and navigate through search results, giving you a better perspective - a better vantage point - on your information. Discover why many of today’s Fortune 100 companies use VantagePoint to help them succeed. VantagePoint is Serious Software for Serious Professionals.
Find more on: http://www.TheVantagePoint.com/
AI-SDV 2020: AI-augmented Question Answering and Semantic Search for Life Sci...Dr. Haxel Consult
We have created structure based chemical ontologies that are used to classify chemical compounds automatically. These classifications can be used with success in semantic search engines to find all representatives of a chemical class. In the present paper we would like to demonstrate use cases when utilizing these chemical classes as features in typical machine learning approaches.
Thus, we have used the co-occurrence of chemical compounds with biological and physico-chemical properties in scientific articles to train models that predict properties of novel compounds that did not occur in those training sets. One example is the prediction of hepatotoxicity as well as bioavailability. In principle, one can use any property that is found in the textual vicinity of compounds to build such predictive models. Criteria will be presented that allow to judge the quality and predictive power of such models.
ICIC 2017: The Use of Patent Information for Innovation and Competitive Intel...Dr. Haxel Consult
Jochen Lennhof (Minesoft, Germany)
Patent data is a critical source of information to stimulate innovation and for competitive intelligence. Patents are often the first and only source of disclosure of a new invention and hence, ignoring them will only delay innovation and give an incomplete competitive intelligence picture.
Delivered from the perspective of a patent analyst, we will use case studies to describe the use of patent data to compile a competitive landscape, to stimulate innovation by learning from others and to help identify valuable IP in a portfolio. We will discuss the challenges in using patents for competitive intelligence and the recent innovative features and functionality in PatBase which can help, including:
Using thesauri, semantic and non-patent literature searching to compile a comprehensive competitive landscape.
The use of Analytics for customised, multidimensional analysis and to visually compare multiple datasets.
Text-mining to automatically identify and highlight concepts within any full text patent.
Citation analysis to identify key competitors, collaborators or potential infringers.
This presentation will demonstrate how any user can benefit from the innovative features and functionality in PatBase to interrogate and visualize the competitive landscape for any technical area.
AI-SDV 2021 - Klaus Kater - The secret of successful CI: precise targeting + ...Dr. Haxel Consult
New technologies like CRISPR-CAS or mRNA based vaccines require CI teams to constantly screen the market for opportunities and threats. The need for more focused intelligence that is targeted to the company’s strategic plans and faster access to critical information for decision makers has been increasing dramatically: More opportunities can be turned into a competitive advantage and management can avert potential threats before they become problematic. Monitoring early technology development, finding licensing opportunities or acquisition targets as well as quick access to broad clinical trial information and close surveillance of the competition are key disciplines of CI teams in R&D;
Modelling Customer Lifetime Revenue for Subscription BusinessDatabricks
Customer lifetime Value/Revenue(LTV/R) is the present value of the future profits/revenue from a customer. Estimating it, is important for businesses to optimise the marketing costs in acquiring and retaining the customers. Complex consumer behaviour and innumerable ways a consumer interacts with the business makes things challenging to estimate it. Years of ongoing research in this field has led to the development of various ML tools and techniques. We would like to take this opportunity to walkthrough some of these techniques and their applications in specific business contexts.
Condé Nast is a global media company that produces some of the world’s leading print, digital, video and social brands. These include Vogue, GQ, The New Yorker, Vanity Fair, Wired and Architectural Digest (AD), Condé Nast Traveler and La Cucina Italiana, among others. Subscription revenue is one of the major revenue streams for the organization and we’d like to demonstrate the implementation of LTV/R model for the subscription revenue for one of the brands using survival models and along with that illustrate the following.
Estimate the average lifetime (ALT) of a brand’s subscriber.
Estimate the average lifetime of various segments within the brand and identify the most valuable/least valuable segments, so marketing teams could device appropriate targeting strategies.
Finally attempt to estimate the lifetime at a subscriber level.
Key insights & findings through the analysis.
Demo of sample code
Leveraging databricks delta files for our big data processing needs.
Biomax provides computational solutions for better decision making and knowledge management in the life science industry. Biomax helps customers generate value from proprietary and public resources by extracting the knowledge indispensable for efficient data exploration and interpretation. They focus on integrating information to enable a knowledge-based approach to develop innovative life science products. The company supports its customers with a platform that combines software products with knowledge resources, including oncology, nutrigenomics, plant research and functional genomics. With the launch of the NeuroXM Brain Science Suite in 2018 Biomax offers products tailored for the field of connectome research. The new Semantic Searching Platform AILANI provides a corporate-wide knowledge repository accessible for everyone, any time and from anywhere. Biomax’s worldwide customer community includes companies and research organizations that are successful in the areas of drug discovery, diagnostics, fine chemicals, food and plant production.
Highlighting how Refinery Advisor enhances decision making by oil & gas professionals by applying the organizations collective knowledge to large complex data sets, in context. -Scott Kimbleton, Associate Partner, Chemicals & Petroleum Global Business Services
TechWiseTV Workshop: Improving Performance and Agility with Cisco HyperFlexRobb Boyd
Find out how organizations like yours are deriving business value from the HyperFlex HCI solution. Join us for a deep dive and Q&A at the TechWiseTV workshop.
TechWiseTV Hyperflex 4.0 Episode: http://cs.co/9009EW2Td
Dissemination Patterns of Technical Knowledge in the IR Industry: Scientometric Analysis of Citations in IR-related Patents
Ricardo Eito-Brun (Universidad Carlos III de Madrid, Spain)
The purpose of this paper is to identify the most influential institutions and journals on information retrieval and text mining through the analysis of the citations in the patents issued in the period between 1990 and 2013.
Bibliographic citations received by different academic journals in a representative set of patents related to the text mining area are analyzed applying sound and consolidated statistical techniques. Besides identifying the most influential academic journals, conferences and institutions in the period under study, the conclusions of this research are also useful to identify the most relevant and productive organizations (those with a higher number of patents) and those organizations whose patents have received a major number of citations.
The analysis also permits to obtain a general view of the disseminations patterns in the consumption of the products of academic and technical research. The period under study offers interesting conclusions regarding the impact of the Web on the Intellectual Property Rights (IPR) strategies of companies building Information Retrieval and Text Mining software solutions, and how the information retrieval and access industry has evolved in the recent years.
The EXTRA classifier is a scalable solution based on recent advances in Natural Language Processing (NLP). The foundational concept of the EXTRA classifier is transfer learning, a machine learning process that enables the relatively low-cost specialization of a pre-trained language model to a specific task in a specific domain with far fewer training examples compared to standard machine learning solutions.
More specifically, the EXTRA classifier leverages BERT, a well-known pre-trained autoencoding language model that has revolutionized the NLP space in the past few years. BERT provides contextual embeddings, i.e., it provides context-aware vector representations of words that capture semantics far more efficiently than their context-free counterparts.
The EXTRA classifier contains a pre-processing module to cope with the inevitable noise in the output of standard Optical Character Recognition systems. The pre-processed plain text from a source document is then fed into a BERT-based classifier, which is built by extending pre-trained BERT with an additional linear layer trained for classification through a process commonly known as fine-tuning.
We will present preliminary results that confirm some clear benefits with respect to rule-based solutions in terms of classification performance and system scalability.
AI-SDV 2020: Using Transformer technology to build an AI based personal News ...Dr. Haxel Consult
After having successfully implemented a proof of concept in late 2019, DS9 started with one of its customers to implement a pilot for a Deep Learning based personal News Rating system.
The system was initially trained to rate news matching a limited number of typical user profiles. Over time the system would evolve and learn more and more about personal preferences and build user-specific rating models.
This talk looks at the many challenges we overcame during implementation: collecting training data, integrating many diverse news sources, management of many user-specific models and automatic deployment to production.
AI-SDV 2021 - Tony Trippe - The Current State of Machine Learning for Patent ...Dr. Haxel Consult
The use of machine learning in IP activities has increased exponentially over the past five years. At the same time new tools, methods and systems have begun to emerge that seek to make the analysis of patent data easier to accomplish using these techniques. Included in these new developments are a significant number of machine learning systems that have begun coming to market. As these changes continue to occur, it would be useful to review some of the tools, systems, or methods that a patent practitioner has at their disposal. Examples and perspectives on the latest advances in machine learning for IP will be provided. There will also be a tour of ML4Patents.com which is devoted to aggregating content associated with the development of this area.
A brief overview of Watson Cognitive solutions and an introduction into how they can be incorporated into the oil & gas arena. Dave Haake, Global C&P Solutions Executive, IBM Watson
Founded in 2004 by a group of business & technology entrepreneurs and inventors, Dolcera is a next generation patent and information analytics company. Our proprietary patent search strategy and domain expertise combined with our game changing AI/ML driven tools like PCS and E-Search enable us to offer top notch patent search and analytics services to our diverse clientele in key decision-making areas of IP strategy & creation, litigation, portfolio analysis, competitive intelligence, product development and licensing.
Our AI- driven tools and services include:
Dolcera PCS – A deep-learning driven superfast patent search engine that offers domain specific Semantic and taxonomy-oriented search capabilities for Patent portfolio analysis, Prior Art, Invalidation and licensing searches.
Dolcera Machine learning for Auto-categorization – The transparent, customizable system solves the problem of human bias or comprehension of a patent when classifying them by leveraging proprietary Dolcera AI/ML. The system gives complete control to the user to deliver highly accurate categorized documents with a quick turn around.
Dolcera Enterprise Search (E-Search) – Dolcera Enterprise Search is a machine learning enabled enterprise-wide search engine integrated with various value added features and smart charts. The application uses patent documents, scientific literature, product information or generally other technical documents inside a company. The system lets users have access to all the related files with minimal clicks to generate actionable insights.
Dolcera ETSI Dashboard – A collection of SEP from various technologies and various specifications updated regularly + PCS (AI driven patent search tool) links the SEPs declared under the specific technical standards & visualize the information for.
Our bespoke services include IP research, technical review, business research and newsletter alerts which help our clients make key strategic decisions in relation to:
Choosing the right technology
Filing & protecting the right IP
Standard mapping and claim charting
Product enhancement
IP, technical & business landscapes
Competitive intelligence
Regulatory landscape
The amalgamation of Dolcera’s AI driven tools, IP services & market research capabilities have time and again assisted our clients in managing their IP & non-IP assets systematically, while enhancing their overall decision-making process.
For further information visit: www.dolcera.com
Our Global presence:
USA: California, Chicago
Europe: Germany, UK
Asia: India, China
AI-SDV 2021: Francisco Webber - Efficiency is the New PrecisionDr. Haxel Consult
The global data sphere, consisting of machine data and human data, is growing exponentially reaching the order of zettabytes. In comparison, the processing power of computers has been stagnating for many years. Artificial Intelligence – a newer variant of Machine Learning – bypasses the need to understand a system when modelling it; however, this convenience comes with extremely high energy consumption.
The complexity of language makes statistical Natural Language Understanding (NLU) models particularly energy hungry. Since most of the zettabyte data sphere consists of human data, such as texts or social networks, we face four major obstacles:
1. Findability of Information – when truth is hard to find, fake news rule
2. Von Neumann Gap – when processors cannot process faster, then we need more of them (energy)
3. Stuck in the Average – when statistical models generate a bias toward the majority, innovation has a hard time
4. Privacy – if user profiles are created “passively” on the server side instead of “actively” on the client side, we lose control
The current approach to overcoming these limitations is to use larger and larger data sets on more and more processing nodes for training. AI algorithms should be optimized for efficiency rather than precision. In this case, statistical modelling should be disqualified as a brute force approach for language applications. When replacing statistical modelling and arithmetic, set theory and geometry seem to be a much better choice as it allows the direct processing of words instead of their occurrence counts, which is exactly what the human brain does with language – using only 7 Watts!
ICIC 2017: Technology Scouting: Decision Support in Strategic Analyses for Te...Dr. Haxel Consult
Stefan Geißler (Expert System Deutschland, Germany)
Tim Schloen (Fraunhofer IAO Stuttgart, Germany)
Trying to keep up to date with developments, trends and opportunities even in narrow domains involves digesting and considering large amounts of textual information and often is beyond the reading capabilities of human experts. Large organisations may be able to cope with this challenge with dedicated analysis teams, but SMEs are overcharge with this information load.
Under the label „Technology Scouting“ we present an approach to employ modern semantic technologies and artificial intelligence to (semi) automate the collection, analysis and reporting of large document collections in a way that allows to derive important insights for technology-driven companies: Which technologies and markets are emerging or moving, how do my clients, partners and competitors operate?
After an introduction to the technological and methodological basis, we present experiences from past industry engagements where this aproach has be applied in production.
AILANI is a novel and unique semantic search enterprise solution for fast, easy and comprehensive knowledge discovery. It combines semantic modelling, ontologies, linguistics and artificial intelligence (AI) algorithms in a self-refining system that delivers results based on inter-related meaning of facts. AILANI not just allows for phrase searches as well as structured queries, it offers its users a unique hybrid natural language question answering system combining machine learning algorithms with semantic network-based "prior knowledge" inference. It integrates seamlessly with existing infrastructure and helps leverage knowledge buried both in decade old data as well as data derived from news feeds and clinical trials providing real-time semantic analysis of breaking news. For the pharmaceutical industry it is critical to stay up to date with the latest clinical trials news for decision-making in drug development. Integration of the relevant data and using ontology-based refiners enables fast and efficient retrieval of information about the clinical competitive landscape.
AI-SDV 2021: Linus Wretblad - Best practice on new intelligent tools in IP ma...Dr. Haxel Consult
New tools are indeed present promising opportunities. However, the use of AI also raises new concerns regarding the transparency of the expected usability and reliability. This presentation will elaborate around some the associated questions; the impact of the training data, the black box symptoms, verification of performance and privacy considerations when using AI. But of course also some insights to best practice of AI search tool and how to take control of the process for better efficiency.
Overview of end-to-end lifecycle to productize and commercialize alternative datasets at S&P Global Market Intelligence
Benefits to discuss:
How S&P Market Intelligence develops new alternative datasets
How S&P Market Intelligence develops robust production processes for alternative data
S&P Global Market Intelligence GTM strategy and capabilities to sell alternative data
Biomax Informatics provides services and software solutions for efficient decision making and knowledge management at the intersection of life sciences, healthcare and information technologies. Biomax facilitates digital transformation within biotech, pharma, agriculture, food and chemical industries as well as research institutes.
Biomax offers a range of standard products, based on the core technology knowledge management technology BioXM™, which are synergistically interrelated.
AILANI™, the Artificial Intelligence LANguage Interface, provides unique semantic search capabilities that catalyze digital change and accelerate the innovation cycle.
NeuroXM™ is the one-stop-shop to decipher brain physiology.
The Clinical Integration System ensures access to real world evidence data, which is critical to effectively and robustly train Artificial Intelligence to support clinical decision support at the point of care.
With more than 20 years of experience and around 50 employees - including numerous life scientists, data scientists and software developers with a scientific background - Biomax is a competent partner.
Founded in 1997, Biomax is ISO 9001 and ISO 27001 certified and is headquartered in Planegg near Munich, Germany.
More info @ www.biomax.com
For over 30 years, Search Technology, Inc. has helped our customers turn information into knowledge. We provide software tools and services that extract more value from patent, scientific, technical and business databases. Our primary product, VantagePoint, helps you rapidly understand and navigate through search results, giving you a better perspective - a better vantage point - on your information. Discover why many of today’s Fortune 100 companies use VantagePoint to help them succeed. VantagePoint is Serious Software for Serious Professionals.
Find more on: http://www.TheVantagePoint.com/
AI-SDV 2020: AI-augmented Question Answering and Semantic Search for Life Sci...Dr. Haxel Consult
We have created structure based chemical ontologies that are used to classify chemical compounds automatically. These classifications can be used with success in semantic search engines to find all representatives of a chemical class. In the present paper we would like to demonstrate use cases when utilizing these chemical classes as features in typical machine learning approaches.
Thus, we have used the co-occurrence of chemical compounds with biological and physico-chemical properties in scientific articles to train models that predict properties of novel compounds that did not occur in those training sets. One example is the prediction of hepatotoxicity as well as bioavailability. In principle, one can use any property that is found in the textual vicinity of compounds to build such predictive models. Criteria will be presented that allow to judge the quality and predictive power of such models.
ICIC 2017: The Use of Patent Information for Innovation and Competitive Intel...Dr. Haxel Consult
Jochen Lennhof (Minesoft, Germany)
Patent data is a critical source of information to stimulate innovation and for competitive intelligence. Patents are often the first and only source of disclosure of a new invention and hence, ignoring them will only delay innovation and give an incomplete competitive intelligence picture.
Delivered from the perspective of a patent analyst, we will use case studies to describe the use of patent data to compile a competitive landscape, to stimulate innovation by learning from others and to help identify valuable IP in a portfolio. We will discuss the challenges in using patents for competitive intelligence and the recent innovative features and functionality in PatBase which can help, including:
Using thesauri, semantic and non-patent literature searching to compile a comprehensive competitive landscape.
The use of Analytics for customised, multidimensional analysis and to visually compare multiple datasets.
Text-mining to automatically identify and highlight concepts within any full text patent.
Citation analysis to identify key competitors, collaborators or potential infringers.
This presentation will demonstrate how any user can benefit from the innovative features and functionality in PatBase to interrogate and visualize the competitive landscape for any technical area.
AI-SDV 2021 - Klaus Kater - The secret of successful CI: precise targeting + ...Dr. Haxel Consult
New technologies like CRISPR-CAS or mRNA based vaccines require CI teams to constantly screen the market for opportunities and threats. The need for more focused intelligence that is targeted to the company’s strategic plans and faster access to critical information for decision makers has been increasing dramatically: More opportunities can be turned into a competitive advantage and management can avert potential threats before they become problematic. Monitoring early technology development, finding licensing opportunities or acquisition targets as well as quick access to broad clinical trial information and close surveillance of the competition are key disciplines of CI teams in R&D;
Modelling Customer Lifetime Revenue for Subscription BusinessDatabricks
Customer lifetime Value/Revenue(LTV/R) is the present value of the future profits/revenue from a customer. Estimating it, is important for businesses to optimise the marketing costs in acquiring and retaining the customers. Complex consumer behaviour and innumerable ways a consumer interacts with the business makes things challenging to estimate it. Years of ongoing research in this field has led to the development of various ML tools and techniques. We would like to take this opportunity to walkthrough some of these techniques and their applications in specific business contexts.
Condé Nast is a global media company that produces some of the world’s leading print, digital, video and social brands. These include Vogue, GQ, The New Yorker, Vanity Fair, Wired and Architectural Digest (AD), Condé Nast Traveler and La Cucina Italiana, among others. Subscription revenue is one of the major revenue streams for the organization and we’d like to demonstrate the implementation of LTV/R model for the subscription revenue for one of the brands using survival models and along with that illustrate the following.
Estimate the average lifetime (ALT) of a brand’s subscriber.
Estimate the average lifetime of various segments within the brand and identify the most valuable/least valuable segments, so marketing teams could device appropriate targeting strategies.
Finally attempt to estimate the lifetime at a subscriber level.
Key insights & findings through the analysis.
Demo of sample code
Leveraging databricks delta files for our big data processing needs.
Biomax provides computational solutions for better decision making and knowledge management in the life science industry. Biomax helps customers generate value from proprietary and public resources by extracting the knowledge indispensable for efficient data exploration and interpretation. They focus on integrating information to enable a knowledge-based approach to develop innovative life science products. The company supports its customers with a platform that combines software products with knowledge resources, including oncology, nutrigenomics, plant research and functional genomics. With the launch of the NeuroXM Brain Science Suite in 2018 Biomax offers products tailored for the field of connectome research. The new Semantic Searching Platform AILANI provides a corporate-wide knowledge repository accessible for everyone, any time and from anywhere. Biomax’s worldwide customer community includes companies and research organizations that are successful in the areas of drug discovery, diagnostics, fine chemicals, food and plant production.
Highlighting how Refinery Advisor enhances decision making by oil & gas professionals by applying the organizations collective knowledge to large complex data sets, in context. -Scott Kimbleton, Associate Partner, Chemicals & Petroleum Global Business Services
TechWiseTV Workshop: Improving Performance and Agility with Cisco HyperFlexRobb Boyd
Find out how organizations like yours are deriving business value from the HyperFlex HCI solution. Join us for a deep dive and Q&A at the TechWiseTV workshop.
TechWiseTV Hyperflex 4.0 Episode: http://cs.co/9009EW2Td
Dissemination Patterns of Technical Knowledge in the IR Industry: Scientometric Analysis of Citations in IR-related Patents
Ricardo Eito-Brun (Universidad Carlos III de Madrid, Spain)
The purpose of this paper is to identify the most influential institutions and journals on information retrieval and text mining through the analysis of the citations in the patents issued in the period between 1990 and 2013.
Bibliographic citations received by different academic journals in a representative set of patents related to the text mining area are analyzed applying sound and consolidated statistical techniques. Besides identifying the most influential academic journals, conferences and institutions in the period under study, the conclusions of this research are also useful to identify the most relevant and productive organizations (those with a higher number of patents) and those organizations whose patents have received a major number of citations.
The analysis also permits to obtain a general view of the disseminations patterns in the consumption of the products of academic and technical research. The period under study offers interesting conclusions regarding the impact of the Web on the Intellectual Property Rights (IPR) strategies of companies building Information Retrieval and Text Mining software solutions, and how the information retrieval and access industry has evolved in the recent years.
Why networking organizations are so valuable in patent information - together we are strong
Monika Hanelt (Agfa Graphics, Belgium)
During this talk it will outlined why the patent information community needs networking organizations - national working groups as well as multinational organizations as PDG and Cepiug here in Europe. The Patent Documentation Group (PDG) is one of the oldest organization in this field and the patent information communtiy owes a lot to PDG and their working groups. Progress regarding data availability and quality due to this organization will be examplified. In recent years the networking of patent offices extended the range of patent information and improved the quality of data as well. A look on actual and future challenges will conclude.
ICIC 2013 Conference Proceedings David Milward LinguamaticsDr. Haxel Consult
Unstructured Text in Big Data: the Elephant in the Room
David Milward (Linguamatics, UK)
A traditional approach has been to extract structured data from the unstructured text. When it comes to big data, manual or semi-automated methods just don't scale. Even with fully automated methods, it is infeasible to extract all the facts and relationships buried in the text to produce a 'knowledge base of everything' that answers every possible end-user question. In recent years, there has therefore been a trend towards using text mining over unstructured data to directly answer questions, effectively creating novel databases on the fly. However, this has widened the gap between information professionals and end users.
In this talk I will explain how multiple approaches are being adopted to exploit the skills of information professionals to improve decision support for end users. These include specialised real-time querying, alerting based on semantic queries, semantic enrichment of documents, and population of semantic stores or linked data.
Machine Learning to Turbo-Charge the Ops Portion of DevOpsDeborah Schalm
Already on a continuous or short-cycle delivery? Constantly rewiring your apps with microservice and similar architectures? Maintaining visibility and maximizing service levels once this stuff gets into production could be a regular nightmare. Coding instrumentation into your apps is time-consuming and error-prone. Instead, let machine learning do the work of adapting your monitoring to your fast-moving application environments. In this webcast learn about various types of machine learning that are optimized for operational data, and see in a demo how this could be leveraged to ensure your ops move as fast as rest of your DevOps pipeline.
Smarter Event-Driven Edge with Amazon SageMaker & Project Flogo (AIM204-S) - ...Amazon Web Services
A single device can produce thousands of events every second. In traditional implementations, all data is transmitted back to a server or gateway for scoring by a machine learning (ML) model. This data is also stored in a data repository for later use by data scientists. In this session, we explore data science techniques for dealing with time series data leveraging Amazon SageMaker. We also look at modeling applications using deterministic rules with streaming pipelines for data prep, and model inferencing using deep learning frameworks directly onto edge devices or onto AWS Lambda using Project Flogo, an open-source event-driven framework. This session is brought to you by AWS partner, TIBCO Software Inc.
Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...Kai Wähner
In 2015, I had two talks about Enterprise Integration Patterns at OOP 2015 in Munich, Germany and at JavaDay 2015 in Kiev, Ukraine. I reused a talk from 2013 and updated it with current trends to show how important Enterprise Integration Patterns (EIP) are everywhere today and in the upcoming years.
This presentation provides an overview of the Rapise automated testing tool from Inflectra. It provides an background on why you need to use automated testing as part of your development process and the features and differentiators that make Rapise your best choice for testing web, mobile, desktop, mainframe and api applications.
This presentation provides an overview of the Rapise automated testing tool from Inflectra. It provides an background on why you need to use automated testing as part of your development process and the features and differentiators that make Rapise your best choice for testing web, mobile, desktop, mainframe and api applications.
Top 10 Most Demand IT Certifications Course in 2020 - MildainTrainingsMildain Solutions
The professionals in the field of Information Technology understands the importance of certification to their career and growth.
The information provided in this guide is backed by real data. Let us look at the top IT certifications that will remain to be a trend in 2020.
Mildaintrainings https://mildaintrainings.com/ offers Several trainings all over the world.
Join us to see how Public-sector organizations and AWS Partners are combining Smart Devices and Artificial Intelligence to create flexible, secure and cost-effective solutions. Applying machine learning models to live video/audio, cameras can be transformed into flexible IoT devices that perform critical functions around public safety, security, property management, smart parking & environmental management. Learn how these solutions are architected using AWS services such as AWS IoT Core, AWS GreenGrass, AWS DeepLens, Amazon SageMaker and Amazon Alexa.
How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...InfluxData
Ezako is a startup specializing in time series analysis. Ezako helps its clients detect anomalies and label their time series data. It helps accelerate the labeling process and analyze vast amounts of data from a variety of sensors in real-time. The company provides anomaly insights and makes it easier for data scientists. Ezako is the creator of Upalgo, which is a time series data management tool that uses AI to automatically detect anomalies in streaming data.
During this webinar, Ezako will dive into how high-frequency sensors can generate huge amounts of data which can become desynchronized. This can result in data quality issues as it can contain errors and glitches. Ezako uses machine learning, labelling and feedback loops to identify these errors. Discover how the company helps improve its clients’ data quality and reduce the number of validation mistakes.
Top Artificial Intelligence Tools & Frameworks in 2023.pdfYamuna5
Artificial intelligence has facilitated the processing and use of data in the business world. With the growth of AI and ML, data scientists and developers now have more AI tools and frameworks to work with. We believe it's important for machine learning platforms to be easy to use for business people who need results, but also powerful enough for technical teams who want to push the boundaries of data analysis with customizable extensions. The key to success is choosing the right AI framework or machine learning library.
Do you have a true Big Data Analytics platform? What's a true Big Data Analytics platform? How can it help capitalize big data? What's needed to build one? This short introductory presentation can help understand what's a true Big Data Analytics platform and how it really helps building Big Data Analytics applications.
Single Source of Truth for Network AutomationAndy Davidson
The importance of building a single source of truth for information within your organisation, when you embark upon a network automation project. Simply automating router configuration steps is not "network automation".
Stefan Geissler kairntech - SDC Nice Apr 2019 Stefan Geißler
Describes the Kairntech approach to real-world NLP/AI requirements, putting an emphasis on the quick and efficient creation and curation of training data sets.
Generative AI in CSharp with Semantic Kernel.pptxAlon Fliess
Join Alon Fliess, Azure MVP, and Microsoft RD in an enlightening lecture where C# meets the forefront of AI. Discover how the Semantic Kernel project bridges traditional programming with advanced AI, empowering C# developers to integrate AI functionalities into their software seamlessly.
Experience a paradigm shift in diagnostics through a real-world example: a sophisticated system crafted with C#, Semantic Kernel, and Azure. Witness the synergy of C# and AI in action, optimizing system analysis and problem-solving in complex environments.
Embark on a journey where C# and AI meet.
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...Dr. Haxel Consult
Knowledge Graphs are an increasingly relevant approach to store detailed knowledge in many domains. Recent advances in NLP allow to enrich Knowledge Graphs through automated analysis of large volumes of literature, reducing a lot the efforts in traditional manual information capturing. In our presentation we report the approach taken in a project with partner Fraunhofer SCAI in the life sciences where a knowledge graph organising detailed facts about psychiatric diseases has been computed.
Information of cause-effect relations between proteins, genes, drugs and diseases has been encoded in the BEL (Biological Expression Language) and imported into a Graph database to approach an indication-wide Knowledge Graph for the selected therapeutic area. Ultimately, updating the graph will amount to just rerunning the analysis on the newly published literature.
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...Dr. Haxel Consult
In 2019 the UK was the first major economy to embrace a legal obligation to achieve net zero carbon emissions by 2050. More broadly, the 2021 UK Innovation Strategy sets out the UK government’s vision to make the UK a global hub for innovation by 2035 with a target of increasing public and private sector R&D expenditure to 2.4% of GDP to support the UK being a science superpower with a world-class research and innovation system.
IP rights create an incentive for R&D which ultimately leads to innovation. Analysis and insights from IP data can therefore help provide a better understanding of how the IP system is being used and where and what innovation is taking place. Research and analysis of IP data is a key input to the ongoing work of the UKIPO’s Green Tech Working Group which seeks to:
further the UK’s status as a global leader by making the UK’s IP environment the best for innovating green technology;
develop and deliver IP policies to support government’s ambition on climate change and green technologies; and
to help innovators best protect and commercialise their green tech innovations both at home and internationally.
The UKIPO has been developing a broad portfolio of ‘green’ IP analytics research. A series of patent analytics reports have been published looking at green technologies, and analysis of how the UK’s Green Channel scheme for accelerated processing of green patent applications has been conducted. Patents have been used to identify technological comparative advantage within different green technologies at a country level, and new insights uncovered by mapping green technology patents to the UN Sustainable Development Goals (SDGs). Trade mark data provides a timeliness and closeness to market factor that patent data does not, and complementary trade mark analysis of UK ‘green’ trade marks, identified using a machine learning algorithm, provides a commercialisation angle to our research.
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...Dr. Haxel Consult
Word embeddings, deep learning, transformer models and other pre-trained neural language models (sometimes recently referred to as "foundational models") have fundamentally changed the way state-of-the-art systems for natural language processing and information access are built today. The "Data-to-Value" process methodology (Leidner 2013; Leidner 2022a,b) has been devised to embody best practices for the construction of natural language engineering solutions; it can assist practitioners and has also been used to transfer industrial insights into the university classroom. This talk recaps how the methodology supports engineers in building systems more consistently and then outlines the changes in the methodology to adapt it to the deep learning age. The cost and energy implications will also be discussed.
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...Dr. Haxel Consult
In the patent domain, all types of issues, from very specific search requirements to the linguistic characteristics of the text domain, are accentuated. Consequently, to develop patent text mining tools for scientists and patent experts, we need to understand their daily work tasks, as well as the linguistic character of the text genre (i.e., patentese). Patent text is a mixture of legal and domain-specific terms. In processing technical English texts, a multi-word unit method is often deployed as a word-formation strategy to expand the working vocabulary, i.e., introducing a new concept without the invention of an entirely new word. This productive word formation is a well-known challenge for traditional natural language processing tools utilizing supervised machine learning algorithms due to limited domain-specific training data. Deep learning technologies have been introduced to overcome the reduction in performance of traditional NLP tools. In the Artificial Researcher technologies, we have integrated explicit and implicit linguistic knowledge into the deep learning algorithms, essential for domain-specific text mining tools. In this talk, we will present a step-by-step process of how we have developed the mentioned text mining tools. For the final outline, we will also demonstrate how these tools can be integrated in a cross-genre passage retrieval system, based on a technology from 2016 that still holds the state-of-the-art within the patent text mining research community in 2022.
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...Dr. Haxel Consult
In 2013 we witnessed an evolutionary change in the NLP field evolved thanks to the introduction of space embeddings that, with the use of deep learning architectures, achieved human-level performances in many NLP tasks. With the introduction of the Attention mechanism in 2017 the results were further improved and, as result, embeddings are quickly becoming the de facto standards in solving many NLP problems. In this presentation, you will learn how generate and use space embedding for search purposes and provide comparison metrics to more traditional relevance-based search engines. Moreover, I will provide some initial results on a paper currently under review that provides an insight on hyperparameter tuning during the generation of embeddings.
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...Dr. Haxel Consult
10 years in the making. How real-world business cases have driven the development of CCC's deep search solutions, leading to the capabilities for web-crawling and delivery of targeted intelligence that helps R&D; intensive companies gain a competitive advantage.
AI-SDV 2022: Machine learning based patent categorization: A success story in...Dr. Haxel Consult
Machine learning based patent categorization: A success story in monitoring a complex technology with high patenting activity
Susanne Tropf (Syngenta, Switzerland)
Kornel Marko (Averbis, Germany)
AI-SDV 2022: Machine learning based patent categorization: A success story in...Dr. Haxel Consult
Machine learning based patent categorization: A success story in monitoring a complex technology with high patenting activity
Susanne Tropf (Syngenta, Switzerland)
Kornel Marko (Averbis, Germany)
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...Dr. Haxel Consult
It is relatively easy for a human to read a document and quickly figure out which concepts are important. However, this task is a difficult challenge for a machine. During the past few decades, there have been two main approaches for concept identification: Natural Language Processing and Machine Learning. During the early part of this century, Machine Learning made great strides as new techniques came into wider use (SVM’s, Topic Modeling, etc..). Sensing the competition, Natural Language Processing responded with deployment of new emerging techniques (sematic networks, finite state automata, etc..). Neither approach has completely solved the WHAT problem. Advances in Artificial Intelligence have the potential to significantly improve the situation. Where AI is making the most impact is as an enhancement to make Machine Learning and Natural Language Processing work better and, more importantly, work together. This presentation looks at some of this history and what might happen in the future when we blend the interpretation of language with pattern prediction.
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...Dr. Haxel Consult
Trademarks serve as key leading indicators for innovation and economic growth. As the vanguards of new and expanding enterprises, trademarks can be used to study entrepreneurship and shifting market demands in response to varying economic factors. This responsiveness has been seen as recently as the COVID-19 pandemic, where trademark research revealed key insights about business reaction to the global upheaval.
At CIPO, we have been delving more deeply than ever before into trademark analysis by leveraging cutting-edge natural language processing (NLP) tools to derive actionable business intelligence from trademark data. In this presentation, we present a survey of NLP in use at CIPO and the insights we have learned applying them. These insights include COVID-19 responses, line-of-business trends based on firm characteristics, and more.
We also discuss ongoing and future trademark research projects at CIPO. These projects include emerging technology detection methods and high-resolution trademark classification systems. We conclude that artificial intelligence-enhanced tools like NLP are key components of future exploitation of trademark data for business and economic intelligence.
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...Dr. Haxel Consult
In our customer projects involving automated document processing, we often encounter document types providing crucial data in the form of tables. While established text analytics algorithms are usually optimized to operate on running text, they tend to produce rather poor results on tables as they do not capture the non-sequential relations inside them (e.g. interpret the content of a table cell relative to its column title, interpret line breaks inside a cell differently from line breaks between cells or rows). While there are elaborate information extraction products in the market for a few highly specific types of tabular documents, there is no general approach out there. The main cause for this is the fact that table structures can be encoded by a heterogenous range of layout means (e.g. column boundaries can be signaled by lines vs. aligned text vs. white space). In this talk, we will illustrate several solutions that we have developed for a range of challenges occurring in this context, both for scanned and digitally generated documents.
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...Dr. Haxel Consult
Most scientific journals request, that the complete set of research data is published simultaneously with the peer-reviewed paper. The publication of the research data usually is carried out as so-called "Supplementary Material", attached to the original paper, or on a "Research Data Repository". Both forms have in common, that the data is published usually unstructured and not in an uniform machine processable format. This makes its further use in electronic tools for AI or data mining unnecessarily difficult or even impossible. A concept is presented, in which the data is digitally recorded, following the principle of FAIR data, as part of the publication process. This digital capture makes the data available to the scientific community for easy use in data mining and AI tools. The data in the repository contains links to the publication to document its origin. The concept is applicable for preprints, peer-review papers, diploma and doctoral theses and is particularly suitable for open access publications. Moreover, the presentation highlights correspondent activities, which were released in scientific publications recently.
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...Dr. Haxel Consult
How do you find video when you only have sparse data? While you can wander the stacks (if you can still find open stacks) for inspiration, video either physical or digital, is difficult to discover. Wandering the virtual stacks is, well, virtually impossible. Discovery platforms on the whole have not replicated the inspirational experience of wandering the stacks.
More companies are using archivable video for internal communication of the various research projects, product developments, test results, and more that are being considered, in progress, or completed. Showing how an experiment was conducted can convey considerably more information that is very difficult to communicate via text. How do you find a company video that might be helpful for your project?
A case study is presented of the problems and the solutions that were implemented by a large, multinational chemical company. A suite of content discovery technologies was used including a video to text to tagging system connected to their documents database and automatically indexed using several chemical as well as conceptual systems (rule-based, NLP, inference engine). To build the system and support the manuscript and video submission there is a metadata extraction program which pulls and inserts the metadata into the submission forms so the author can move quickly through that process.
Copyright Clearance Center
A pioneer in voluntary collective licensing, CCC (Copyright Clearance Center) helps organizations integrate, access, and share information through licensing, content, software, and professional services. With expertise in copyright and information management, CCC and its subsidiary RightsDirect collaborate with stakeholders to design and deliver innovative information solutions that power decision-making by helping people integrate and navigate data sources and content assets. CCC recently acquired the assets and technology of Deep SEARCH 9 (DS9), a knowledge management platform that leverages machine learning to help customers perform semantic search, tag content, and discover new insights.
Lighthouse IP is the world’s leading provider of intellectual property content. The core business of Lighthouse IP is sourcing and creating content from the world’s most challenging authorities. Specialized in IP data, Lighthouse IP provides over 160 countries coverage for patents, over 200 authorities for trademarks and over 90 authorities for designs. Lighthouse IP data is available via several partners. The company is headquartered in Schiphol-Rijk in the Netherlands and has offices in the United States, China, Thailand, Vietnam, Egypt, Indonesia and Belarus. Globally a team of 150 experts works on the creation of this unique data collection.
CENTREDOC was created in 1964 as the technical information center of the swiss watchmaking industry. Building on a strong team of engineers, CENTREDOC now offers a complete range of services and solutions for the monitoring of strategic, technological and competitive information. CENTREDOC is also a leader in the research of patent, technical and business intelligence, and offers consulting expertise in the implementation of monitoring solutions.
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...Dr. Haxel Consult
The everyday use of AI-driven algorithms for data search, analysis and synthesis comes with important time savings, but also reveals the need to understand and accept the limitations of the technology. Practical deployments on concrete topics are inevitable to assess and manage the challenges of neuronal network based AI. A workshop report.
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...Dr. Haxel Consult
What if there was a platform where literature, conference abstracts, patents, clinical trials, news, grants and other sources were fully integrated? What if the data would be harmonized, enriched with standardized concepts and ready for analysis? After building our patent analytics platform we didn’t stop dreaming and built our big data analytics platform by semantically integrating text-rich, scientific sources. In my presentation I will talk about what we built and why we built it. And, of course, I will also address the challenges and hurdles along the way. Was it worth it and what comes next? Let’s talk about it!
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
5. Businesses are threatened by the BIG DATA explosion
A substantial part of this data consists of text
Social Nets
Blogs
Documents
Numbers
Webpages
Email
News
6. Semantic processing of natural language
becomes essential for businesses...
(1) to locate documents
(2) to find web content
(3) to match the skills of people
(4) to identify and select products
(5) to monitor competitors
(6) to file business information
(7) to drive product innovation
(8) to track customer satisfaction
(9) to avoid duplication of work
(10) to advertise on the Internet
(11) to mine for evidences
(12) to improve security
7. The Downsides
of the state-of-the-art
High quality semantic systems are:
•Hard to build (sometimes impossible)
•Inaccurate and fragile (in real-world use)
•Expensive to buy (licenses and/or services)
•Tricky to integrate (setup, tuning, training ...)
•Laborious to run (metadata management)
•Time-consuming to maintain (dict., ontologies)
10. cat
cat + dog
dog
The fingerprints allow direct semantic comparison
of the meanings of any two words
11. cat
cat + dog
Home
and family
aspects
dog
Biology
aspects
Leisure
and free
time
aspects
Cat
specific
Dog
specific
Social
aspects
The CEPT-Retina shows
thousands of semantic relations for any two words
13. Example documents
provided by user for
a specific field of
interest
document fingerprints
profile fingerprint
Document Fingerprints are aggregated into
User-Profile Fingerprints
14. Similar Documents
Result-Set
query fingerprint
Similarity Engine
Profile FP
Profile Ranking
Fingerprints of
indexed document
collection
The high performance similarity algorithm provides
the Semantic Search Engine functionality
16. CEPT technology is offered as cloud based SaaS
✓It is simple to use, efficient and cost-effective
✓It can be easily embedded in diverse applications
✓It is licensed to developers as a web-service API
✓It is offered as transaction-based subscription-model
17. Strategy to ensure developer adoption
✓
No Natural Language Processing experience needed by
developers
✓
Existing domain developers can immediately start using it
✓
Business people can be part of the development team
✓
CEPT-API is offered as 12 different flavors, each one optimized for
a specific business case
18. Overview 12 Disciplines of Information Retrieval
1.
2.
3.
4.
5.
Enterprise Search - Document Management
Website Search - Marketing, Sales, Support
Profile Matching - Recruiting, Marketplaces, Dating
Product Search - E-Commerce, Retail, Wholesale
Alerting Systems - Competitive Intelligence
6.
7.
8.
9.
Content Classification - News Filtering, Document Classifier
Information Discovery - Innovation Support
Opinion Mining - Social Network / Peer groups Analysis
Office Automation - Intelligent Office Productivity System
10. Keyword Generation - Search Engine Optimization
11. Forensic Text Analytics - Evidence Mining, Plagiarism Detection
12. Intelligence & Security - Real Time Message Monitoring
19. CEPT Service Technology Stack
Demos
cortical.io
Demos
Example Code Example Code
and
and
Widgets
Widgets
Tutorials
Discipline 2
Website
Search API
CEPT API
Retina
En
De
Fr
Sp
Chem
Med
Bio
Pharm
Monitoring
Expression Engine
Accounting
Demos
Example Code Example Code
and
and
Widgets
Widgets
...
Tutorials
Discipline 1
Enterprise
Search API
Demos
Tutorials
Tutorials
Discipline 11
Forensic
Analytics API
Discipline 12
Intelligence
Security API
Similarity Engine
Key Management
Identity Management
Service Management Layer (3Scale)
Elastic Beanstalk
Dynamo DB
Simple Workflow Service
Cloud Infrastructure (Amazon)
Elastic Block Store
22. ✓
✓
✓
✓
Lowering Cost: The most expensive part of producing language learning
materials are the pedagogic editors. Phase-6 can lower these costs by
maintaining a repository of 1Billion high quality example sentences, that are
quality checked using the CEPT-API.
Improve Learner Motivation: The learner can choose a topic of interest
and all the examples, exercises and tests are generated within this topic using
the CEPT-Retina.
Individualized Tests: Traditional e-learning systems just discriminate
between „right“ and „wrong“. Using the CEPT-API allows Phase-6 to give
more specific answers like „nearly right“ or „you are on your way“.
Intelligent Tutoring: Using the CEPT-API the Phase-6 system can give
intelligent hints, depending on the learners chosen topic, his skill level and the
kind of error he made before.