Automating the process of continuously prioritising data, updating and deploy... - Ola Spjuth
Presentation at Data Innovation Summit 2019 in Stockholm, Sweden.
ABSTRACT
Microscopes are capable of producing vast amounts of data, and when used in automated laboratories, both the number and size of images present many challenges for storing, categorizing, analyzing, annotating, and transforming the data into actionable information that can be used for decision making, either by humans or machines. In this presentation I will describe the informatics system we have established at the Department of Pharmaceutical Biosciences at Uppsala University, which consists of computational hardware (CPUs, GPUs, storage), middleware (Kubernetes), an imaging database (OMERO), and a workflow system (Pachyderm) to perform online prioritization of new data, as well as the continuous analytics system that automates the process from captured images to continuously updated and deployed AI models. The AI methodologies include deep learning models trained on image data and conventional machine learning models trained on features extracted from images or chemical structures. Thanks to the microservice architecture, the system is scalable and can be expanded into hybrid architectures with cloud computing resources. The informatics system serves a robotized cell profiling setup with incubators, liquid handling, and high-content microscopy. The lab is quite young and targets applications primarily in drug screening and toxicity assessment, with the aim of improving research using AI and intelligent design of experiments.
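The online-prioritization idea can be illustrated with a small, self-contained sketch (not the actual Uppsala system, whose models and infrastructure are far richer): score each incoming image with some informativeness measure and forward only the top-ranked images for further analysis. Here the scoring function is a stand-in; a real pipeline would use a trained model's uncertainty or novelty score.

```python
import heapq

def score(image):
    # Stand-in for a trained model's informativeness score
    # (e.g. prediction uncertainty); here: fraction of bright pixels.
    return sum(1 for px in image if px > 128) / len(image)

def prioritise(images, k):
    """Return the indices of the k highest-scoring images, best first."""
    scored = ((score(img), i) for i, img in enumerate(images))
    top = heapq.nlargest(k, scored)
    return [i for (_, i) in top]

# Three toy "images" as flat pixel lists
imgs = [[0, 0, 0, 255], [255, 255, 0, 255], [255, 0, 0, 0]]
print(prioritise(imgs, 2))  # indices of the two most "informative" images
```

In a production setting this scoring step would sit inside a Pachyderm pipeline stage, triggered whenever new images land in the input repository.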
Many of us data science and business analytics practitioners perform research and analysis for decision makers on a regular basis. The deliverable of such analysis is often a PowerPoint presentation and/or a model that needs to be productionized. The code used to produce the analysis also needs to be considered a deliverable.
Many of us perform analysis without reproducibility in mind. With the increasing democratization of data, it is becoming more and more important for people who may not have scientific training to be able to create analyses that can be picked up by somebody else, who can then reproduce the results. That, and creating reproducible research is just solid science.
We are going to spend an evening walking through the various tools available to create reproducible research on Big Data. You will get introduced to the Tidyverse of R packages and how to use them. We will discuss the ins and outs of various notebook technologies like Jupyter and Zeppelin. You will have an opportunity to learn how to get up and running with R and Spark, and the various options you have to learn on real clusters instead of just your local environment. There will also be a quick introduction to source control and the various options you have around using Git.
The theme of the evening will be “getting started”. We will go over various training resources and show you the optimal path to go from zero to master. Some commentary will be provided around the current state of the job market and intel from the front lines of the data science language wars. This is a large topic and the evening will be fairly dynamic and responsive to the needs of the audience.
Bob Wakefield has spent the better part of 16 years building data systems for many organizations across various industries. He has been running Hadoop in a lab environment for 3 years. He is the principal of Mass Street Analytics, LLC, a boutique data consultancy. Mass Street is a Hortonworks Consultant Partner and Confluent Partner.
In his spare time, he likes to work on an equity investment application that combines various sources of information to automatically arrive at investing decisions. When he is not doing that, you’ll find him flying his A-10 simulator. Full CV can be found here: https://www.linkedin.com/in/bobwakefieldmba/
Efficient Instant-Fuzzy Search with Proximity Ranking
The system finds answers to a query instantly while the user types in keywords character by character.
Fuzzy search improves the user's search experience by finding relevant answers whose keywords are similar to the query keywords.
A main computational challenge in this paradigm is the high-speed requirement.
At the same time, we also need good ranking functions that consider the proximity of keywords to compute relevance scores.
Previous systems could only recommend results based on the previously typed characters kept in a cache module.
In many cases, the previous search log can be used to make the recommendation system faster.
Relevance to the user's query, along with the user's intentions, can then be mined more easily.
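A minimal sketch of the core idea, fuzzy matching of a typed prefix against indexed keywords, might look like this in Python. This is an illustration only; production instant-search systems use specialized index structures such as tries rather than scanning every word:

```python
def edit_distance(a, b):
    # Classic dynamic-programming Levenshtein distance (rolling row).
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[-1]

def fuzzy_prefix_matches(query, words, max_edits=1):
    """Words with some prefix within max_edits of the typed query,
    ranked by how closely that prefix matches."""
    hits = []
    for w in words:
        best = min(edit_distance(query, w[:k])
                   for k in range(min(len(w), len(query) + max_edits) + 1))
        if best <= max_edits:
            hits.append((best, w))
    return [w for _, w in sorted(hits)]

print(fuzzy_prefix_matches("algo", ["algorithm", "allocate", "biology"]))
```

A proximity-aware ranker would then re-score each matched document by how close its matched keywords appear to one another.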
Leveraging Analytics for Dynamic Review Strategies - Ipro Tech
Uncover review strategies using Analytics to enhance dynamic workflows and increase document review speeds. This hands-on session will provide a glimpse into several of the available analytics technologies in Ipro for enterprise.
MR201402 effectiveness of unknown malware classification by logistic regressi... - FFRI, Inc.
• Apply logistic regression analysis to static information of executables and measure the resulting detection rate and false positive rate.
• Investigate how these rates differ on another file set.
• For the detection rate especially, it is important to see how the features collected from malware in one time span differ from those collected in the following span.
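As a toy illustration of the approach, not FFRI's actual experiment or feature set, a logistic regression classifier over made-up static features of executables can be sketched in pure Python, along with the detection rate and false positive rate the bullets mention:

```python
import math

def train_logreg(X, y, lr=0.5, epochs=500):
    """Plain stochastic-gradient-descent logistic regression."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wi * xij for wi, xij in zip(w, xi)) + b
            p = 1 / (1 + math.exp(-z))   # predicted P(malware)
            g = p - yi                   # gradient of the log loss
            w = [wi - lr * g * xij for wi, xij in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, xi):
    z = sum(wi * xij for wi, xij in zip(w, xi)) + b
    return 1 / (1 + math.exp(-z)) >= 0.5

# Hypothetical static features per executable: [entropy, scaled import count]
X = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.9], [0.1, 0.8]]
y = [1, 1, 0, 0]  # 1 = malware, 0 = benign (made-up labels)
w, b = train_logreg(X, y)
preds = [predict(w, b, xi) for xi in X]
detection_rate = sum(p for p, t in zip(preds, y) if t) / sum(y)
false_positive_rate = sum(p for p, t in zip(preds, y) if not t) / (len(y) - sum(y))
print(detection_rate, false_positive_rate)
```

The report's actual question, how these rates shift when the model is evaluated on files from a later time span, would be answered by holding out a second, later file set and recomputing both rates on it.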
Tracking Citations to Research Software via PIDs - ETH-Bibliothek
Tracking citations to research software via persistent identifiers is difficult due to the dilution of citations over the many PIDs assigned to a software package. On top of this, software citations are often edited out by actors throughout the scholarly communication process, such as reference managers, publishers, professors and discovery systems. Thus, the survival rate of a software citation is extremely low in the current scholarly ecosystem. The Sloan-funded Asclepias project is a collaboration between a publisher, a discovery system and a repository, with the goal of promoting scientific software into an identifiable, citable, and preservable object. We have built a citation broker that is currently tracking some 6,000 citations to Zenodo DOIs from NASA ADS, CrossRef and EuropePMC.
Rightsizing Open Source Software Identification - nexB Inc.
Webinar recording available at the end of the slide deck.
Heather Meeker, partner at O'Melveny & Myers LLP and Philippe Ombredanne, founder at nexB Inc. discussed the latest open source software identification tools available for use in your compliance process.
Agenda
- Key Elements of a Policy for use of OSS
- Overview of OSS Identification
- Survey of open source and commercial tools for OSS Identification
- Rightsizing your OSS Identification Process and Tools.
If you are interested in open source scanning and open source compliance products, please visit http://www.nexb.com/; see also https://www.youtube.com/user/DejaCode/ for other webinar recordings.
Automatically Retrieving and Loading Data into Siebel CTMS from Multiple CRO ... - Perficient, Inc.
With more and more sponsors moving towards a clinical outsourcing model, the need to obtain data from multiple CRO partners in a standard format and place it into a sponsor’s CTMS has increased. Automatically retrieving and loading data in XML format via a custom utility and integration could be a viable solution and enable sponsors to have an accurate snapshot of all their studies at any given time.
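As a rough sketch of the retrieval-and-load step, with a made-up XML schema (real CRO feeds and CTMS loaders are sponsor-specific), parsing a study status feed into rows ready for loading might look like:

```python
import xml.etree.ElementTree as ET

# Hypothetical CRO status feed; the real schema is agreed per sponsor/CRO.
feed = """<studies>
  <study id="ST-001"><site>Site A</site><enrolled>42</enrolled></study>
  <study id="ST-002"><site>Site B</site><enrolled>17</enrolled></study>
</studies>"""

def parse_study_feed(xml_text):
    """Flatten a CRO XML feed into dict rows for CTMS loading."""
    root = ET.fromstring(xml_text)
    return [{"study_id": s.get("id"),
             "site": s.findtext("site"),
             "enrolled": int(s.findtext("enrolled"))}
            for s in root.findall("study")]

for row in parse_study_feed(feed):
    print(row)
```

A scheduled utility would fetch such feeds from each CRO, run this kind of parse, validate the rows, and insert them into the sponsor's CTMS tables.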
Please join Perficient’s clinical trial management and application development experts for a complimentary webinar that will feature a customer case study discussing:
The customer’s situation and objectives
Business reasons for using a custom utility and integration
Technical details and considerations
Lessons learned
Building an Open Source AppSec Pipeline - 2015 Texas Linux Fest - Matt Tesauro
Take the ideas of DevOps and the notion of a delivery pipeline and combine them for an AppSec Pipeline. This talk covers the open source components used to create an AppSec Pipeline and the benefits we received from its implementation.
PatSeer Premier edition is a complete professional patent research package comprising an online global patent database and research platform with integrated analytics, project workflow, and collaboration capabilities. PatSeer Premier goes well beyond current systems in its analytics, team collaboration and data sharing capabilities.
Defending the Enterprise with Evernote at SourceBoston on May 27, 2015 - grecsl
Most people are already familiar with Evernote. It’s easy to just throw all our miscellaneous data into the Elephant and effortlessly find it later with a quick search or correlate similar ideas with tags. Evernote is literally our external brain that increases our intelligence and helps us become more productive overall. This presentation discusses an experiment of using Evernote as a defensive management platform, the specific concepts and strategies used, and its overall effectiveness. Specific topics covered will include the advantages of using an open and flexible platform that can be molded into an open/closed source threat intelligence database, an information sharing platform, and an incident case management system. Although using Evernote in this way in large enterprises is probably not possible, the same lessons learned can be applied to implement a similarly effective system using internally-hosted open source or commercial software.
Big City Metadata Tour - Pingar and Partners visit big cities to talk about how to get more value out of your SharePoint. John Peltonen, of 3Sharp, joined Owen for a joint presentation.
The Art of the Pitch: WordPress Relationships and Sales - Laura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips and strategies for successful relationship building that lead to closing the deal.
Key Trends Shaping the Future of Infrastructure - Cheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
This keynote covers the key trends across hardware, cloud and open source, exploring how these areas are likely to mature and develop over the short and long term, and considering how organisations can position themselves to adapt and thrive.
Epistemic Interaction - tuning interfaces to provide information for AI support - Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... - DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
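PowSyBl's own Python binding exposes these simulation features directly; as a library-free illustration of the kind of computation its power flow tools perform, here is a toy DC power flow on a three-bus network (this is not PowSyBl code, and the network data is made up):

```python
def dc_power_flow(lines, injections, slack=0):
    """Toy DC power flow: build the bus susceptance matrix B and
    solve B*theta = P by Gaussian elimination (no pivoting; toy only).
    lines: (from_bus, to_bus, susceptance); injections: per-bus P (pu)."""
    n = len(injections)
    B = [[0.0] * n for _ in range(n)]
    for f, t, b in lines:
        B[f][f] += b; B[t][t] += b
        B[f][t] -= b; B[t][f] -= b
    # Remove the slack bus row/column and solve the reduced system.
    keep = [i for i in range(n) if i != slack]
    A = [[B[i][j] for j in keep] + [injections[i]] for i in keep]
    m = len(keep)
    for c in range(m):                       # forward elimination
        for r in range(c + 1, m):
            f = A[r][c] / A[c][c]
            A[r] = [ar - f * ac for ar, ac in zip(A[r], A[c])]
    theta = [0.0] * n                        # slack angle stays 0
    for c in reversed(range(m)):             # back-substitution
        s = A[c][m] - sum(A[c][j] * theta[keep[j]] for j in range(c + 1, m))
        theta[keep[c]] = s / A[c][c]
    flows = {(f, t): b * (theta[f] - theta[t]) for f, t, b in lines}
    return theta, flows

# Bus 0 is the slack; bus 1 draws 1.0 pu, bus 2 generates 0.5 pu.
lines = [(0, 1, 10.0), (0, 2, 10.0), (1, 2, 10.0)]
theta, flows = dc_power_flow(lines, [0.0, -1.0, 0.5])
print(flows[(0, 1)])  # power flowing from the slack bus toward the load
```

PowSyBl's actual load-flow engines handle full AC models, contingencies and much larger networks; the point here is only to show what "solving a power flow" means at its simplest.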
Generative AI Deep Dive: Advancing from Proof of Concept to Production - Aggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
UiPath Test Automation using UiPath Test Suite series, part 4 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimizing testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... - UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
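The active-learning idea mentioned above can be sketched generically (this is not UiPath's implementation): uncertainty sampling selects the unlabelled items the model is least confident about and routes only those to a human for labelling, which is what accelerates training.

```python
def uncertainty_sample(probs, k):
    """Pick the k items whose predicted probability is closest to 0.5,
    i.e. the ones a binary classifier is least sure about."""
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]

# Model confidence on five unlabelled documents (hypothetical scores)
probs = [0.97, 0.52, 0.10, 0.45, 0.85]
print(uncertainty_sample(probs, 2))  # indices to send for human labelling
```

Each labelling round retrains the model on the newly labelled items, so the most informative examples are acquired first instead of labelling the whole corpus up front.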
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Accelerate your Kubernetes clusters with Varnish Caching - Thijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
DevOps and Testing slides at DASA Connect - Kari Kakkonen
Slides by me and Rik Marselis at the DASA Connect conference on 30 May 2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps means. We also ran a lovely workshop in which the participants explored different ways to think about quality and testing in different parts of the DevOps infinity loop.
UiPath Test Automation using UiPath Test Suite series, part 3 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation introduction
UI automation sample
Desktop automation flow
Speakers:
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 - Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
2. Most investigation procedures require examining an Outlook Data File (PST) without having Outlook installed on the target machine. The challenge, in short:
• Traverse a PST without Outlook
• Search for evidence inside a PST among thousands of emails
• Examine both ANSI and Unicode formats of PST
• Categorize and examine attachments
• Present a structured and genuine investigation report
(Forensic Investigation of PST File without Outlook)

3. Increase in the demand for an independent Outlook forensic solution, 2007–2015: 87%

6. Solution, continued:
• Use the search filters available in the software to search for email evidence.
• The available search algorithms are General, PreDefined, Advance and Proximity.
• Use various search criteria such as To, From, Subject, Date, etc.
• Use combinations of the logical operators AND, OR and NOT.
• Get the evidence bookmarked and add the necessary tags.

7. To learn the step-by-step procedure, please visit: http://www.mailxaminer.com/blog/search-inside-pst-file-without-outlook/ (Search Inside PST without Outlook Using MailXaminer)
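The filter-combination step described above, combining field criteria with AND, OR and NOT, can be sketched as a small predicate evaluator (an illustration only, not MailXaminer's actual code; the field names are hypothetical):

```python
def matches(email, criteria):
    """Evaluate a nested AND/OR/NOT combination of (field, needle)
    criteria against a dict-shaped email record."""
    def test(c):
        if c[0] == "NOT":
            return not test(c[1])
        if c[0] in ("AND", "OR"):
            results = [test(sub) for sub in c[1:]]
            return all(results) if c[0] == "AND" else any(results)
        field, needle = c  # leaf criterion: substring match on a field
        return needle.lower() in email.get(field, "").lower()

    return test(criteria)

email = {"from": "alice@example.com", "subject": "Quarterly report"}
query = ("AND", ("from", "alice"), ("NOT", ("subject", "invoice")))
print(matches(email, query))
```

A forensic tool would run such a predicate over every message in the PST, then bookmark and tag the hits, as described in the slides above.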