This presentation describes two modes of web-based knowledge acquisition in the domain of bioinformatics: "pull" models, such as social tagging systems that engage passive altruism, and "push" models, such as the Mechanical Turk, that actively guide and incentivise the knowledge acquisition process.
Large data sets comprising multiple correlated attributes may include phenomena that are hard to identify and understand using traditional data analysis and visualization methods. HeatMiner is a new visual data mining technology that visualizes data as three-dimensional heatmaps. Even complex patterns missed by other methods are easy to recognize in a 3D heatmap at a glance. Try HeatMiner with your own data at the Cloud’N’Sci.fi Algorithms-as-a-Service marketplace!
Presentation at the Agua Viva Canarias Festival - Bluefin tuna (Atún rojo) - Sebastián Losada
Presentation given at the AguaViva Canarias Festival on the use of spatial measures for the protection of bluefin tuna.
Presentation for the Meeting of the network Scientists for Cycling (ECF) in Sevilla, 2011. It shows information about Esther Anaya's work and research fields.
Pascal Hartmann is a German sociologist and an experienced strategy executive and theory designer. He is also the Director of the R & D Department at Logon Architecture. With an eye to the future, his work embraces the architectural heritage of the city in a sustainable fashion.
Gene Wiki and Mark2Cure update for BD2K - Benjamin Good
An introduction to the Gene Wiki project with an emphasis on the use of the new WikiData project. Also describes Mark2Cure, a citizen science initiative focused on biomedical text mining.
An update on the Gene Wiki project, an introduction to the knowledge.bio semantic search application, and an introduction to the biobranch.org collaborative decision tree builder.
Solutions for the Texas Energy Shortage - Rick Borry
Ron Seidel, PE, principal at RBS Energy Consulting and a board member of Principal Solar, Inc., will discuss and answer questions about his recent whitepaper, "Solutions for the Texas Energy Shortage."
Ron's whitepaper is very timely: in the summer of 2011, Texas experienced extremely low reserve-margin periods throughout the state, causing average wholesale electricity prices to skyrocket to more than twice their normal level. Given that Texas is expected to add another 14 million people to its population between 2010 and 2030, these shortages raise alarms about the state's ability to meet future energy demand. Success will depend on finding the most effective way to incentivize the development of more capacity.
Unlike many other states, Texas has had a competitive retail market for electricity since 2001, replacing the traditional cost-of-service-based regulated market. The market requires customers to choose a competitive electricity supplier and allows retail suppliers to set their prices without regulatory interference. However, regulatory action has resulted in caps being placed on system-wide wholesale power prices with the intent of protecting consumers. It is these system-wide offer caps that have limited prices, reduced potential profitability for wholesalers and restrained the development of new generation.
Download the complete whitepaper at www.principalsolarinstitute.org/documents.
Chief Product Officer Tim Brown's presentation from Exponential Sydney's Digital Leaders Dinner, delivering insights around the future and evolution of online advertising and attribution. For more details, tweet us at @exponentialinc or visit our website: www.exponential.com
Integrating Pathway Databases with Gene Ontology Causal Activity Models - Benjamin Good
The Gene Ontology (GO) Consortium (GOC) is developing a new knowledge representation approach called ‘causal activity models’ (GO-CAM). A GO-CAM describes how one or several gene products contribute to the execution of a biological process. In these models (implemented as OWL instance graphs anchored in Open Biological Ontology (OBO) classes and relations), gene products are linked to molecular activities via semantic relationships like ‘enables’, molecular activities are linked to each other via causal relationships such as ‘positively regulates’, and sets of molecular activities are defined as ‘parts’ of larger biological processes. This approach provides the GOC with a more complete and extensible structure for capturing knowledge of gene function. It also allows for the representation of knowledge typically seen in pathway databases.
Here, we present details and results of a rule-based transformation of pathways represented using the BioPAX exchange format into GO-CAMs. We have automatically converted all Reactome pathways into GO-CAMs and are currently working on the conversion of additional resources available through Pathway Commons. By converting pathways into GO-CAMs, we can leverage OWL description logic reasoning over OBO ontologies to infer new biological relationships and detect logical inconsistencies. Further, the conversion helps to increase standardization for the representation of biological entities and processes. The products of this work can be used to improve source databases, for example by inferring new GO annotations for pathways and reactions and can help with the formation of meta-knowledge bases that integrate content from multiple sources.
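As a toy illustration of the instance-graph structure described above, the sketch below represents GO-CAM-style assertions as plain triples in Python. The identifiers and labels are invented, not real GO or Reactome IDs, and actual GO-CAMs are OWL instance graphs rather than flat triple lists.

```python
# Toy GO-CAM-style instance graph: individuals linked by semantic relations
# such as 'enables', 'positively_regulates', and 'part_of'.
# All identifiers below are hypothetical.
gocam = [
    # (subject, relation, object)
    ("gene_product:KinaseX",       "enables",              "activity:kinase_activity_1"),
    ("activity:kinase_activity_1", "positively_regulates", "activity:transcription_1"),
    ("activity:kinase_activity_1", "part_of",              "process:signal_transduction"),
    ("activity:transcription_1",   "part_of",              "process:signal_transduction"),
]

def activities_enabled_by(graph, gene_product):
    """Return the molecular activities a gene product enables."""
    return [o for s, r, o in graph if s == gene_product and r == "enables"]

def parts_of(graph, process):
    """Return the activities asserted to be part of a biological process."""
    return [s for s, r, o in graph if r == "part_of" and o == process]

print(activities_enabled_by(gocam, "gene_product:KinaseX"))
print(parts_of(gocam, "process:signal_transduction"))
```

In the real system, these relations come from OBO ontologies and the graphs are amenable to OWL description-logic reasoning, which is what enables the inference and consistency checking mentioned above.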
Pathways2GO: Converting BioPAX pathways to GO-CAMs - Benjamin Good
Presentation at the Gene Ontology Consortium Annual Meeting. Describing the automatic conversion of biochemical pathways in the Reactome Knowledge Base into the Gene Ontology 'Causal Activity Model' representation.
Building a Biomedical Knowledge Garden - Benjamin Good
Describes the tribulations of building a large biomedical knowledge graph. Provides a comparison between the UMLS and Wikidata in terms of content and structure. Concludes with the idea of anchoring the knowledge graph in Wikidata items and properties.
When the Heart BD2K grant was originally written, we proposed to build something called “Big Data World” to help advance citizen science, scientific crowdsourcing and science education, especially in bioinformatics. This past year, that idea has become Science Game Lab ( https://sciencegamelab.org ), a collaboration between the Su laboratory at Scripps Research, Playmatics LLC, and, recently, the creators of WikiPathways.
Opportunities and challenges presented by Wikidata in the context of biocuration - Benjamin Good
Abstract—Wikidata is a world-readable and world-writable knowledge base maintained by the Wikimedia Foundation. It offers the opportunity to collaboratively construct a fully open-access knowledge graph spanning biology, medicine, and all other domains of knowledge. To realize this potential, social and technical challenges must be overcome, many of which are familiar to the biocuration community. These include community ontology building, high-precision information extraction, provenance, and license management. By working together with Wikidata now, we can help shape it into a trustworthy, unencumbered central node in the Semantic Web of biomedical data.
(Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery - Benjamin Good
PubMed now indexes roughly 25 million articles and is growing by more than a million per year. The scale of this “Big Knowledge” repository renders traditional, article-based modes of user interaction unsatisfactory, demanding new interfaces for integrating and summarizing widely distributed knowledge. Natural language processing (NLP) techniques coupled with rich user interfaces can help meet this demand, providing end-users with enhanced views into public knowledge, stimulating their ability to form new hypotheses.
Knowledge.Bio provides a Web interface for exploring the results from text-mining PubMed. It works with subject, predicate, object assertions (triples) extracted from individual abstracts and with predicted statistical associations between pairs of concepts. While agnostic to the NLP technology employed, the current implementation is loaded with triples from the SemRep-generated SemmedDB database and putative gene-disease pairs obtained using Leiden University Medical Center’s ‘Implicitome’ technology.
Users of Knowledge.Bio begin by identifying a concept of interest using text search. Once a concept is identified, associated triples and concept-pairs are displayed in tables. These tables have text-based and semantic filters to help refine the list of triples to relations of interest. The user then selects relations for insertion into a personal knowledge graph implemented using cytoscape.js. The graph is used as a note-taking or ‘mind-mapping’ structure that can be saved offline and then later reloaded into the application. Clicking on edges within a graph or on the ‘evidence’ element of a triple displays the abstracts where that relation was detected, thus allowing the user to judge the veracity of the statement and to read the underlying articles.
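The personal knowledge-graph workflow described above (select relations, save offline, reload later) can be sketched minimally as follows. The node names, predicate, and PMID are invented for illustration; the actual application renders the graph with cytoscape.js.

```python
import json

# Minimal save/reload cycle for a personal knowledge graph built from
# subject-predicate-object relations. All names and IDs are hypothetical.
graph = {
    "nodes": ["GENE_X", "disease_Y"],
    "edges": [
        {"subject": "GENE_X", "predicate": "ASSOCIATED_WITH",
         "object": "disease_Y", "evidence": ["PMID:0000000"]},
    ],
}

saved = json.dumps(graph)      # save offline (e.g. write to a file)
reloaded = json.loads(saved)   # later, reload into the application

# Clicking an edge would display the abstracts listed under 'evidence',
# letting the user judge the veracity of the statement.
print(reloaded["edges"][0]["evidence"])
```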
Knowledge.Bio is a free, open-source application that provides deep, personal, concise, shareable views into the “Big Knowledge” scattered across the biomedical literature.
Application: http://knowledge.bio
Source code: https://bitbucket.org/sulab/kb1/
Building a massive biomedical knowledge graph with citizen science - Benjamin Good
The life sciences are faced with a rapidly growing array of technologies for measuring the molecular states of living things. From sequencing platforms that can assemble the complete genome sequence of a complex organism involving billions of nucleotides in a few days to imaging systems that can just as rapidly churn out millions of snapshots of cells, biology is truly faced with a data deluge.

To translate this information into new knowledge that can guide the search for new medicines, biomedical researchers increasingly need to build on the existing knowledge of the broad community. Prior knowledge can help guide searches through the masses of new data. Unfortunately, most biomedical knowledge is represented solely in the text of journal articles. Given that more than a million such articles are published every year, the challenge of using this knowledge effectively is substantial. Ideally, knowledge such as the interrelations between genes, drugs and diseases would be represented in a knowledge graph that enabled queries like: “show me all the genes related to this disease or related to any drugs used to treat this disease”. Systems exist that attempt to extract this information automatically from text, but the quality of their output remains far below what can be obtained by human readers.

We are developing a new platform that taps the language comprehension abilities of citizen scientists to help excavate a queryable knowledge graph from the biomedical literature. In proof-of-concept experiments, we have demonstrated that lay-people are capable of extracting meaningful information from complex biological text. The information extracted using this community intelligence framework can surpass the efforts of individual experts in quality while also offering the potential to achieve massive scale. In this presentation we will describe the results of early experiments and introduce our prototype citizen science platform: http://mark2cure.org.
Branch: An interactive, web-based tool for building decision tree classifiers - Benjamin Good
A crucial task in modern biology is the prediction of complex phenotypes, such as breast cancer prognosis, from genome-wide measurements. Machine learning algorithms can sometimes infer predictive patterns, but there is rarely enough data to train and test them effectively, and the patterns that they identify are often expressed in forms (e.g. support vector machines, neural networks, random forests composed of tens of thousands of trees) that are very difficult to understand. In addition, it is generally unclear how to include prior knowledge in the course of their construction.
Decision trees provide an intuitive visual form that can capture complex interactions between multiple variables. Effective methods exist for inferring decision trees automatically but it has been shown that these techniques can be improved upon via the manual interventions of experts. Here, we introduce Branch, a new Web-based tool for the interactive construction of decision trees from genomic datasets. Branch offers the ability to: (1) upload and share datasets intended for classification tasks (in progress), (2) construct decision trees by manually selecting features such as genes for a gene expression dataset, (3) collaboratively edit decision trees, (4) create feature functions that aggregate content from multiple independent features into single decision nodes (e.g. pathways) and (5) evaluate decision tree classifiers in terms of precision and recall. The tool is optimized for genomic use cases through the inclusion of gene and pathway-based search functions.
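As a rough illustration of points (2) and (4) above, here is a toy decision tree with a hand-chosen single-gene split and a pathway-style feature function that aggregates several genes into one decision node. The gene names, thresholds, and labels are hypothetical and not drawn from Branch itself.

```python
# Toy manually built decision tree over gene-expression features.
# All gene names, thresholds, and class labels are invented.

def pathway_mean(sample, genes):
    """Feature function: mean expression across a set of genes (a 'pathway')."""
    return sum(sample[g] for g in genes) / len(genes)

def classify(sample):
    # Root node: a single-gene split chosen by the expert.
    if sample["GENE_A"] > 2.0:
        return "poor prognosis"
    # Child node: an aggregate, pathway-level split.
    if pathway_mean(sample, ["GENE_B", "GENE_C", "GENE_D"]) > 1.5:
        return "poor prognosis"
    return "good prognosis"

samples = [
    {"GENE_A": 3.1, "GENE_B": 0.2, "GENE_C": 0.3, "GENE_D": 0.1},
    {"GENE_A": 0.4, "GENE_B": 2.0, "GENE_C": 2.5, "GENE_D": 1.8},
    {"GENE_A": 0.2, "GENE_B": 0.1, "GENE_C": 0.2, "GENE_D": 0.3},
]
print([classify(s) for s in samples])
```

Evaluating such a tree against held-out labels would yield the precision and recall figures that point (5) refers to.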
Branch enables expert biologists to easily engage directly with high-throughput datasets without the need for a team of bioinformaticians. The tree building process allows researchers to rapidly test hypotheses about interactions between biological variables and phenotypes in ways that would otherwise require extensive computational sophistication. In so doing, this tool can both inform biological research and help to produce more accurate, more meaningful classifiers.
A prototype of Branch is available at http://biobranch.org/
The Cure: Making a game of gene selection for breast cancer survival prediction - Benjamin Good
Background: Molecular signatures for predicting breast cancer prognosis could greatly improve care through personalization of treatment. Computational analyses of genome-wide expression datasets have identified such signatures, but these signatures leave much to be desired in terms of accuracy, reproducibility and biological interpretability. Methods that take advantage of structured prior knowledge (e.g. protein interaction networks) show promise in helping to define better signatures but most knowledge remains unstructured. Crowdsourcing via scientific discovery games is an emerging methodology that has the potential to tap into human intelligence at scales and in modes previously unheard of.
Objective: The main objective of this study was to test the hypothesis that knowledge linking expression patterns of specific genes to breast cancer outcomes could be captured from players of an open, Web-based game. We envisioned capturing knowledge both from the player’s prior experience and from their ability to interpret text related to candidate genes presented to them in the context of the game.
Methods: We developed and evaluated an online game called “The Cure” that captured information from players regarding genes for use in predictors of breast cancer survival. Information gathered from game play was aggregated using a voting approach and used to create rankings of genes. The top genes from these rankings were evaluated using annotation enrichment analysis, comparison to prior predictor gene sets, and by using them to train and test machine learning systems for predicting 10-year survival.
Results: Between its launch in Sept. 2012 and Sept. 2013, The Cure attracted more than 1,000 registered players who collectively played nearly 10,000 games. Gene sets assembled through aggregation of the collected data showed significant enrichment for genes known to be related to key concepts such as Cancer, Disease Progression, and Recurrence (P < 1.1e-07). In terms of the accuracy of models trained using them, these gene sets provided comparable performance to gene sets generated using other methods including those used in commercial tests. The Cure is available at http://genegames.org/cure/
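The voting-based aggregation described in the Methods can be sketched as follows; the game plays and gene names below are invented for illustration, not data from The Cure.

```python
from collections import Counter

# Each game play contributes votes for the genes the player selected;
# genes are then ranked by total votes across all plays.
plays = [
    ["TP53", "BRCA1", "MYC"],   # genes chosen in one game
    ["TP53", "EGFR"],
    ["BRCA1", "TP53"],
]

votes = Counter(gene for play in plays for gene in play)
ranking = [gene for gene, _ in votes.most_common()]
print(ranking[:2])
```

The top genes from such a ranking are what the study then evaluated via enrichment analysis and as features for survival classifiers.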
Poster: Microtask crowdsourcing for disease mention annotation in PubMed abstracts - Benjamin Good
Benjamin M. Good, Max Nanis, Andrew I. Su
Identifying concepts and relationships in biomedical text enables knowledge to be applied in computational analyses that would otherwise be impossible. As a result, many biological natural language processing (BioNLP) projects attempt to address this challenge. However, the state of the art in BioNLP still leaves much room for improvement in terms of precision, recall and the complexity of knowledge structures that can be extracted automatically. Expert curators are vital to the process of knowledge extraction but are always in short supply. Recent studies have shown that workers on microtasking platforms such as Amazon’s Mechanical Turk (AMT) can, in aggregate, generate high-quality annotations of biomedical text.
Here, we investigated the use of AMT to capture disease mentions in PubMed abstracts. We used the recently published NCBI Disease corpus as a gold standard for refining and benchmarking the crowdsourcing protocol. After merging the responses from 5 AMT workers per abstract with a simple voting scheme, we achieved a maximum F-measure of 0.815 (precision 0.823, recall 0.807) over 593 abstracts as compared to the NCBI annotations on the same abstracts. Comparisons were based on exact matches to annotation spans. The results can also be tuned to optimize for precision (max = 0.98 when recall = 0.23) or recall (max = 0.89 when precision = 0.45). It took 7 days and cost $192.90 to complete all 593 abstracts considered here (at $0.06/abstract, with 50 additional abstracts used for spam detection).
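A minimal sketch of such a voting scheme follows: keep a disease-mention span if at least k of the n workers marked it, then score exact-span matches against the gold standard. The spans and the threshold are invented for illustration; the study's actual merging and tuning may differ.

```python
from collections import Counter

def merge_by_vote(worker_annotations, min_votes):
    """worker_annotations: one set of (start, end) spans per worker."""
    counts = Counter(span for spans in worker_annotations for span in spans)
    return {span for span, c in counts.items() if c >= min_votes}

def precision_recall(predicted, gold):
    tp = len(predicted & gold)  # exact span matches only
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

# Five workers' annotations for one abstract (hypothetical character spans).
workers = [
    {(0, 12), (30, 44)},
    {(0, 12)},
    {(0, 12), (30, 44), (50, 60)},
    {(30, 44)},
    {(0, 12), (50, 60)},
]
gold = {(0, 12), (30, 44)}

# Raising min_votes trades recall for precision, and vice versa.
merged = merge_by_vote(workers, min_votes=3)
print(merged, precision_recall(merged, gold))
```

Varying `min_votes` from 1 to 5 is one simple way to produce the precision/recall trade-off curve mentioned above.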
This experiment demonstrated that microtask-based crowdsourcing can be applied to the disease mention recognition problem in the text of biomedical research articles. The F-measure of 0.815 indicates that there is room for improvement in the crowdsourcing protocol but that, overall, AMT workers are clearly capable of performing this annotation task.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs - Alex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... - Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
GridMate - End to end testing is a critical piece to ensure quality and avoid... - ThomasParaiso2
End-to-end testing is a critical piece of ensuring quality and avoiding regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... - DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
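PowSyBl's own APIs are not reproduced here; as a self-contained illustration of what one of its grid simulation tools (a power flow, in the DC approximation) computes, the following toy example solves a three-bus network by hand. The bus numbering, susceptances, and injections are all invented.

```python
# Toy DC power flow on a three-bus network, solved by hand.
# Lines: (from_bus, to_bus, susceptance in per-unit); bus 0 is the slack bus.
lines = [(0, 1, 10.0), (1, 2, 10.0), (0, 2, 10.0)]

# Net injections at the non-slack buses: bus 1 is a load, bus 2 a generator.
p1, p2 = -1.0, 0.5

# Reduced susceptance matrix for buses 1 and 2:
#   [ b01+b12   -b12    ] [theta1]   [p1]
#   [ -b12      b02+b12 ] [theta2] = [p2]
b11, b12m, b22 = 20.0, -10.0, 20.0

# Solve the 2x2 system for the voltage angles (slack angle is 0).
det = b11 * b22 - b12m * b12m
theta = {
    0: 0.0,
    1: (b22 * p1 - b12m * p2) / det,
    2: (b11 * p2 - b12m * p1) / det,
}

# Line flows follow from angle differences: f_ij = b_ij * (theta_i - theta_j).
flows = {(i, j): b * (theta[i] - theta[j]) for i, j, b in lines}
print(flows)
```

PowSyBl's load-flow engines solve the full AC problem with far richer network models, but the input/output shape (a network in, angles and flows out) is the same idea.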
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery (CI/CD) process includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to market, combined with traditionally slow and manual security checks, has created gaps in continuous security, an important piece of the software supply chain. Today, organizations feel more susceptible to external and internal cyber threats due to the vast attack surface of their application supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerabilities and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work and a knack for helping others understand how things work. He has around 20 years of solution-engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Removing Uninteresting Bytes in Software Fuzzing - Aftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behavior in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined tools from two critical Linux projects -- Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
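The core idea, dropping bytes whose removal leaves the program's observed behavior unchanged, can be sketched in a few lines. The target "program" below is a stand-in function, not AFL or DIAR's actual algorithm, which operates on real instrumented binaries.

```python
# Toy seed trimming: drop every byte whose removal does not change the
# program's observed behavior, keeping the seed lean before fuzzing.

def program_behavior(data: bytes) -> str:
    # Stand-in target: only the 4-byte header and the presence of b'<' matter.
    if data[:4] != b"SEED":
        return "reject"
    return "parse" if b"<" in data else "skip"

def trim_seed(seed: bytes) -> bytes:
    baseline = program_behavior(seed)
    i = 0
    while i < len(seed):
        candidate = seed[:i] + seed[i + 1:]
        if program_behavior(candidate) == baseline:
            seed = candidate   # byte was uninteresting: drop it
        else:
            i += 1             # byte matters: keep it and move on
    return seed

seed = b"SEEDxxxx<tag>yyyy"
print(trim_seed(seed))   # only the behavior-relevant bytes survive
```

Here the filler bytes disappear while the header and the `<` marker survive, so mutations during fuzzing are concentrated on bytes that can actually change execution.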
- These are slides from the talk given at the IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), 2022.
UiPath Test Automation using UiPath Test Suite series, part 6 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series, part 6. In this session, we will cover test automation with generative AI and OpenAI.
The UiPath Test Automation with generative AI and OpenAI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI with OpenAI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and OpenAI
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Essentials of Automations: The Art of Triggers and Actions in FME - Safe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT stylesheets and schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating, explaining, or refactoring code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0! (SOFTTECHHUB)
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work and a knack for helping others understand how things work. He has around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms, and is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Bio Logical Mass Collaboration
1. bioLogical mass collaboration
Benjamin Good
University of British Columbia
Symposium on (Bio)semantics for complex systems biology, Leiden University Medical Center
12 March 2009
9. More data captured
[Diagram: a tagging event links a Tagger (Jane), a tagged Resource (http://upload.wikimedia.org/wikipedia/commons/c/c9/Hippocampus-mri.jpg), a date (2007-8-29), a tagging context, and the associated tags: hippocampus, mri, image, wikipedia]
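The tagging-event model pictured on this slide can be sketched as a small record type. This is a hypothetical Python illustration (the class name and fields are my own, not from the talk):

```python
from dataclasses import dataclass


@dataclass
class TaggingEvent:
    """One tagging event: who tagged what, when, and with which tags."""
    tagger: str    # the person applying the tags
    resource: str  # URL of the tagged resource
    date: str      # when the tagging occurred
    tags: list     # free-text tags associated with the resource


# The event from the slide, expressed in this model
event = TaggingEvent(
    tagger="Jane",
    resource="http://upload.wikimedia.org/wikipedia/commons/c/c9/Hippocampus-mri.jpg",
    date="2007-8-29",
    tags=["hippocampus", "mri", "image", "wikipedia"],
)
```

The point of capturing the full event rather than bare tags is that the tagger, the date, and the context all become queryable data.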
10. Tags
• Not the same as either professionally or automatically generated keywords (Al-Khalifa & Davis 2007)
• Can be used to improve Web search (Morrison 2008)
11. Tagging in science?
• How does social tagging compare to professional indexing in the life sciences? (Good, Tennis, Wilkinson, in preparation)
12. “Tuned responses of astrocytes and their influence on hemodynamic signals in the visual cortex”
16. open social tagging in science
➡ low numbers of tags per post
➡ low numbers of posts per document
➡ low value of tags as descriptors
17. adding value to each tag
• social semantic tagging
➡ tagging with encoded concepts instead of strings of letters
➡ = the Entity Describer (E.D.)
Good, Kawas, Wilkinson (2007) Bridging the gap between social tagging and semantic annotation. Nature Precedings
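Semantic tagging boils down to resolving a free-text label to a stable concept identifier, so that two users' tags denote the same entity rather than two strings. A hypothetical Python sketch (the index and the example.org URI are invented for illustration and are not the E.D.'s actual API):

```python
def resolve(label, concept_index):
    """Map a free-text tag to encoded concept URIs via a label index."""
    return concept_index.get(label.strip().lower(), [])


# Toy label -> concept index, standing in for an ontology lookup service
concept_index = {
    "hippocampus": ["http://example.org/brain-ontology#Hippocampus"],
}

# Two differently-cased string tags resolve to the same encoded concept
resolved = [resolve(t, concept_index) for t in ["hippocampus", "Hippocampus"]]
```

Aggregating over concept URIs instead of raw strings is what lets a semantic tagging system treat "hippocampus" and "Hippocampus" as one annotation.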
24. E.D. can be customized
• Tag with: genes, gene ontology terms, terms from OWL ontologies
• Recently used to conduct a successful experiment in BioMoby Web service annotation
25. but!
• Does not address the volume problem - more participation is needed to make social tagging a useful source of bioLogical knowledge.
26. The plan for today
Mostly-manual strategies for creating bioLogical knowledge
• pull ➡ social tagging
• push ➡ frames and games
27. push
• Key difference from the pull model is that system designers push specific requests to users
• many incentive options: financial, psychological...
28. Pushy pattern
1. design frame for knowledge to be collected
2. choose incentive system
3. design interface
4. collect knowledge
5. aggregate knowledge
29. Mechanical Turk: pushing with money
• A “marketplace for work” hosted by Amazon Inc.
• “artificial artificial intelligence”
30. Mechanical Turk and NLP
• Snow et al (2008) used workers on the AMT to label text for use in training/testing NLP algorithms: word sense disambiguation, affect recognition, and several more.
Snow et al (2008) Cheap and Fast—But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks. In Empirical Methods in Natural Language Processing, pp. 254-263
31. Snow et al (2008) cont.
Results for affect recognition:
• labels = 7000
• cost = $2
• time = 5.9 hours
• when aggregated, results equal or better than expert labelers in most cases
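The aggregation the slide refers to is, in its simplest form, a per-item majority vote over the non-expert labels. A hypothetical sketch (the data is invented for illustration; Snow et al also explore weighted schemes):

```python
from collections import Counter


def aggregate(labels_per_item):
    """Majority vote: pick the most common label among each item's votes."""
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in labels_per_item.items()}


# Toy affect-recognition votes from several non-expert annotators
votes = {
    "headline-1": ["joy", "joy", "surprise"],
    "headline-2": ["anger", "anger", "anger"],
}
consensus = aggregate(votes)
```

Several cheap, noisy labels combined this way can match one expert label, which is what made the $2 / 5.9-hour result possible.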
32. ESP game, pushing with fun
Von Ahn and Dabbish (2004) Labeling Images with a Computer Game
http://www.cs.cmu.edu/~biglou/ESP.pdf
33. ESP game results (2004)
• >4 million images labeled
• >23,000 players
• Given 5,000 players online simultaneously, could label all of the images accessible to Google in a month
• (See the “Google image labeling game”…)
34. iCAPTURer: assessing push for bioLogical knowledge
• Can we acquire bio-ontological knowledge from untrained volunteers in a scalable, Web-based manner?
• 2 experiments in the context of scientific conferences
Good et al. 2006. Fast, cheap, and out of control: a zero-curation model for ontology development.
Good and Wilkinson 2007. Ontology engineering using volunteer labor.
35. iCAPTURer 1
Goals:
1. Identify concepts from text
2. Link concepts to synonyms and to hyponyms (‘x is_a y’) rooted in the UMLS Semantic Network
44. Initial acquisition versus evaluation
[Chart: number of assertions gathered: ~11,000 during knowledge capture at the YI forum versus ~1,000 during the evaluation conducted via email request]
45. Initial acquisition versus evaluation
[Chart, annotated: ~11,000 assertions gathered during knowledge capture (“I assert that t cell activation is a kind of immune response”) versus ~1,000 during evaluation (“I agree that t cell activation is a kind of immune response”)]
• Knowledge capture at YI forum: forms, tree navigation, conference setting, 3 days, 68 people
• Evaluation via email request: multiple choice (voting), home setting, 2 days, 65 people
46. iCAPTURer 2 pattern
1. Infer complete ontology
2. Present each edge as a multiple choice question {true, false, I don’t know}
3. Aggregate votes to decide on each triple
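Step 3 of this pattern can be sketched as a vote threshold over the informative answers, with “I don’t know” responses excluded. A hypothetical Python sketch (the threshold and labels are my assumptions, not from the iCAPTURer papers):

```python
def decide(votes, threshold=0.5):
    """Accept a candidate triple when the fraction of 'true' votes among
    informative votes (i.e. excluding "I don't know") exceeds the threshold."""
    informative = [v for v in votes if v in ("true", "false")]
    if not informative:
        return "undecided"
    frac_true = informative.count("true") / len(informative)
    return "accept" if frac_true > threshold else "reject"


# e.g. votes on the edge "t cell activation is_a immune response"
verdict = decide(["true", "true", "false", "I don't know"])
```

Filtering out “I don’t know” before thresholding keeps honest abstentions from diluting the signal, though (as the next slide shows) it cannot correct for voters biased towards “yes”.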
48. iCAPTURer 2 results
[Chart: fraction of subClass judgments made per volunteer, volunteers 1-25, y-axis 0 to 1.2]
• Same pattern of participation
• Only 66% correct overall in assessing subClass assertions
• Highly biased towards saying ‘yes’
49. iCAPTURer summary
• Scientifically relevant tasks are harder - the population pool is smaller, but in my experience generally very willing.
• Engaging the competitive instinct was helpful in obtaining the responses we did.
• Much room for further investigation.
51. Filling in Freebase with Typewriter
[Diagram: “X is a Y”]
http://typewriter.freebaseapps.com/
March 9, 2009
53. To achieve mass collaborative bioLogical knowledge assembly, make it possible for people to contribute in multiple modes
- as creators
- as evaluators
- as system builders (open APIs are crucial)
and for multiple reasons
- personal information management
- fun, competition
- finance
[Diagram: relations R linking entities X and Y]
56. “...how you envision future developments...”
Automation + Human computation = increasingly high-throughput bioLogical knowledge representation
57. “...how your own expertise would fit into this realm...”
[Diagram: more bioLogical analyses require knowledge representation and machine learning; ben knows a bit about these and about community action]
58. Thanks to
• developers: Eddie Kawas, Paul Lu
• advisor: Mark Wilkinson
• Barend Mons for the invitation and Marco Roos for the accommodation!
http://biordf.net/~bgood/