Histopathological analysis of tissue sections is the gold standard for assessing the presence of many complex diseases, such as tumors, and it is expected to be at the center of the AI revolution in medicine, a prediction supported by the increasing success of deep learning applications in digital pathology. The aim of histolab is to provide a tool for processing Whole Slide Images (WSIs) in a reproducible environment to support clinical and scientific research. histolab is designed to handle WSIs, automatically detect the tissue, and retrieve informative tiles.
Histolab: an Open Source Python Library for Reproducible Digital Pathology
1. histolab: an Open Source Python Library for Reproducible Digital Pathology
November 9-11, 2021
Ernesto Arbitrio
ernesto.arbitrio@gmail.com
@__pamaron__
Alessia Marcolini
alessia.marcolini@hk3lab.ai
@viperale
2. Alessia
Data Science M.Sc. @ TU Eindhoven / TU Berlin
Junior Data Scientist @ HK3lab
PyCon Italia Organizer
Ernesto
Senior Backend Engineer @ YouGov PLC
PyCon Italia Organizer
Open Source Contributor
3. day 1
me: I am so excited to work on this new digital pathology project!!
4. Histopathology
Primary diagnostic resource for the identification of complex diseases, in particular of tumors
Digital Pathology
Scanning of histopathological glass slides to create Whole Slide Images
Image credits: https://www.poliambulanza.it/dipartimenti/dipartimento-di-oncologia/anatomia-patologica; Section of Pathology and Tumour Biology, University of Leeds
5. day 1
me: I am so excited to work on this new digital pathology project!! Let’s find some literature first!
11. Whole Slide Images
Multi-resolution image (e.g. 5× and 20×) in pyramidal format
Up to 90,000 px × 30,000 px
Very large files, up to 10 GB
Scanner-vendor-specific file formats, e.g. .svs, .vms, .ndpi, .tif, .bif, .scn
Ad-hoc software for viewing and processing
Artifacts like shadows, mold, pen marks
Image from Y. Wang et al. 2012. SurfaceSlide: A multitouch digital pathology platform. 10.1371/journal.pone.0030783
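To make the sizes on this slide concrete, here is a minimal sketch of the pyramid arithmetic. It rests on assumptions that are not in the talk: 8-bit RGB pixels, the largest quoted size taken as the 20× base level, and the 5× level being a 4× downsample of it. The function names are hypothetical and do not belong to any library.

```python
def level_dimensions(base_width, base_height, downsamples):
    """Return (width, height) of each pyramid level for its downsample factor."""
    return [(base_width // d, base_height // d) for d in downsamples]


def uncompressed_bytes(width, height, channels=3):
    """Size of the raw pixel data, assuming one byte per channel."""
    return width * height * channels


# Largest slide size quoted on the slide, assumed to be the 20x base level.
base_w, base_h = 90_000, 30_000

# Assumed pyramid: 20x (factor 1) and 5x (factor 4).
dims = level_dimensions(base_w, base_h, downsamples=[1, 4])
for (w, h), mag in zip(dims, ("20x", "5x")):
    gb = uncompressed_bytes(w, h) / 1e9
    print(f"{mag}: {w} x {h} px, about {gb:.2f} GB uncompressed")
```

Under these assumptions the base level alone holds about 8.1 GB of raw pixel data, which is why a slide is normally read region by region or at a lower-resolution level rather than loaded whole.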
12. Whole Slide Images
https://twitter.com/tdmckee/status/1456585006982340611?s=21
14. why?
Using WSIs directly as input to Deep Learning (DL) models is infeasible
Preprocessing to create smaller subwindows ("tiles") is required
Preprocessing steps are usually poorly detailed in research papers, leading to results that are hard to reproduce
Need for a reference, high-quality preprocessing software to enable faster prototyping and faster experimentation
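The tiling step mentioned on this slide can be sketched as a plain grid computation. This is an illustration only, not histolab's implementation: `grid_tile_coords` and its parameters are hypothetical names, and the sketch simply drops partial tiles at the right and bottom edges.

```python
def grid_tile_coords(width, height, tile_size, stride=None):
    """Top-left (x, y) corners of full tiles laid out on a regular grid."""
    stride = stride or tile_size  # no overlap unless a smaller stride is given
    return [
        (x, y)
        for y in range(0, height - tile_size + 1, stride)
        for x in range(0, width - tile_size + 1, stride)
    ]


# Hypothetical 2048 x 1024 px region split into 512 px tiles.
coords = grid_tile_coords(width=2048, height=1024, tile_size=512)
print(len(coords))   # 4 columns x 2 rows
print(coords[:5])
```

Writing the tile size, stride, and edge policy down as explicit parameters is the kind of detail that, when left out of a paper, makes the preprocessing hard to reproduce.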
16. why?
https://paperswithcode.com/
17. why?
https://www.paperswithoutcode.com/
https://twitter.com/alexkyllo/status/1457072262520004632
https://twitter.com/petebankhead/status/1407630531290927105?s=21
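A quick back-of-the-envelope calculation shows why a whole slide cannot be fed to a network in one piece. The sketch below assumes an uncompressed 8-bit RGB image at the upper size bound of 90,000px × 30,000px and a 512px tile; the function names are illustrative only.

```python
def uncompressed_size_gb(width_px, height_px, channels=3, bytes_per_channel=1):
    """Size in gigabytes of the raw pixel array (assumes 8-bit RGB by default)."""
    return width_px * height_px * channels * bytes_per_channel / 1e9


def tile_grid(width_px, height_px, tile_px=512):
    """Number of tile columns and rows needed to cover the image."""
    cols = -(-width_px // tile_px)  # ceiling division
    rows = -(-height_px // tile_px)
    return cols, rows


size_gb = uncompressed_size_gb(90_000, 30_000)  # 8.1 GB for a single slide
cols, rows = tile_grid(90_000, 30_000)          # 176 x 59 = 10,384 tiles
```

A single slide therefore turns into thousands of candidate tiles, and how they are selected is exactly the preprocessing step that needs to be documented.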
22. A new open source Python package for
reproducible Whole Slide Image preprocessing,
aimed at easy integration with a Deep Learning pipeline
23. Approach
Unifying community-validated procedures for slide preprocessing and tiles extraction
Introducing best practices from software engineering: automated testing, code versioning, code reviews and Continuous Integration
Building on top of state-of-the-art and well-known libraries, e.g. OpenSlide, NumPy and scikit-image
26. Histolab features
#1 Interoperability between different formats: up to 9 supported formats from the major scanner vendors
#2 Automatic tissue detection and segmentation, by using a fixed sequence of image filters
#3 Automatic informative tiles retrieval: cropped regions from the tissue areas found in #2
27. Histolab features
#4 Easy access to sample data from TCGA and OpenSlide: save them to the system cache and import them
28. Histolab in action
WSI → Image dataset (tiles) → DL pipeline
Prostate Cancer Sample, TCGA-PRAD, 16,000px × 15,316px
Example tiles: 512px × 512px at magnification 5×; 512px × 512px at magnification 20×
29. Tiles extraction #3
in less than 10 lines of code
>>> # 1. download a breast tissue sample from TCGA
>>> from histolab.data import breast_tissue
>>> _, path = breast_tissue()
>>> # 2. create a Slide object
>>> from histolab.slide import Slide
>>> slide = Slide(path, "path/to/processed")
>>> # 3. create a Tiler
>>> from histolab.tiler import RandomTiler
>>> random_tiles_extractor = RandomTiler(
...     tile_size=(512, 512),
...     n_tiles=10,
...     level=2,
...     seed=42,
...     check_tissue=True,
... )
>>> # 4. extract!
>>> random_tiles_extractor.extract(slide)
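The idea behind seeded random extraction with a tissue check can be sketched in plain Python. This is not histolab's implementation: the function names, the 80% tissue threshold and the `max_iter` cap are assumptions made for the example, and the mask is a nested list where a real pipeline would use an image array.

```python
import random


def tissue_fraction(mask, x, y, tile_size):
    """Fraction of pixels flagged as tissue inside the tile whose corner is (x, y)."""
    w, h = tile_size
    return sum(sum(mask[row][x:x + w]) for row in range(y, y + h)) / (w * h)


def sample_random_tiles(mask, tile_size, n_tiles, seed,
                        min_tissue=0.8, max_iter=10_000):
    """Draw random tile corners, keeping only tiles with enough tissue.

    `min_tissue` and `max_iter` are illustrative defaults, not library values.
    """
    rng = random.Random(seed)  # fixed seed -> same tiles on every run
    w, h = tile_size
    height, width = len(mask), len(mask[0])
    coords = []
    for _ in range(max_iter):
        if len(coords) == n_tiles:
            break
        x = rng.randint(0, width - w)
        y = rng.randint(0, height - h)
        if tissue_fraction(mask, x, y, tile_size) >= min_tissue:
            coords.append((x, y))
    return coords
```

Fixing the seed is what makes the extraction repeatable: two runs on the same mask with the same seed return the same coordinates.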
30. Tissue detection and tiles extraction: RandomTiler
Figure panels: Original WSI, Tissue Detection, Random Tiles
Breast Cancer Sample, TCGA-BRCA, 96,972px × 30,682px
Tiles: 512px × 512px, Magnification 20×
31. Tissue detection and tiles extraction: ScoreTiler
>>> from histolab.scorer import NucleiScorer
>>> scorer = NucleiScorer()
>>> from histolab.tiler import ScoreTiler
>>> scored_tiles_extractor = ScoreTiler(
...     scorer,
...     tile_size=(512, 512),
...     n_tiles=10,
...     level=2,
...     check_tissue=True,
... )
>>> scored_tiles_extractor.extract(slide)
Figure: representation of the score assigned to each extracted tile by the NucleiScorer.
Ovarian Cancer Sample, TCGA-OV, 30,001px × 33,987px
Panels: Nuclei Mask, 512px × 512px tile
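Score-based extraction can be pictured as "score every tile on a grid, keep the best n". The sketch below uses the fraction of nuclei pixels in a binary mask as a stand-in score; that scoring rule and both function names are assumptions for illustration and do not reproduce the NucleiScorer formula.

```python
def nuclei_fraction(nuclei_mask, x, y, tile):
    """Share of pixels marked as nuclei in the square tile at (x, y)."""
    rows = nuclei_mask[y:y + tile]
    return sum(sum(row[x:x + tile]) for row in rows) / (tile * tile)


def top_scored_tiles(nuclei_mask, tile, n_tiles):
    """Score all non-overlapping grid tiles and return the n best as (score, (x, y))."""
    height, width = len(nuclei_mask), len(nuclei_mask[0])
    scored = [
        (nuclei_fraction(nuclei_mask, x, y, tile), (x, y))
        for y in range(0, height - tile + 1, tile)
        for x in range(0, width - tile + 1, tile)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)  # highest score first
    return scored[:n_tiles]
```

Unlike random sampling, this selection is deterministic: the same mask always yields the same ranked tiles.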
32. Tissue detection #2
by using this fixed sequence of image filters
Figure panels: Original image; 1. Grayscale filter; 2. Otsu threshold; 3. Binary dilation; 4. Remove small holes; 5. Remove small objects; 6. Final mask; 7. Biggest Tissue Area Box
Nobuyuki Otsu 1979. A threshold selection method from gray-level histograms. 10.1109/TSMC.1979.4310076
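The Otsu step picks the gray level that best separates two pixel populations by maximising the between-class variance. A minimal pure-Python sketch follows; it works on a flat list of 8-bit gray values, and the convention that pixels at or below the threshold count as tissue (dark tissue on a bright background) is an assumption of this example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t maximising between-class variance (classes: <= t and > t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]           # pixels with value <= t
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue              # no pixels in the lower class yet
        w_fg = total - w_bg       # pixels with value > t
        if w_fg == 0:
            break                 # upper class is empty from here on
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def tissue_mask(pixels, threshold):
    """Assumed convention: dark pixels (<= threshold) are tissue."""
    return [p <= threshold for p in pixels]
```

The resulting binary mask is what the morphological steps (dilation, hole and small-object removal) then clean up.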
33. Tissue detection #2
>>> from histolab.filters.image_filters import Compose, OtsuThreshold, RgbToGrayscale
>>> from histolab.filters.morphological_filters import (
...     BinaryDilation,
...     RemoveSmallHoles,
...     RemoveSmallObjects,
... )
>>> # chain the filters in the order of steps 1-5 of the sequence
>>> filters = Compose(
...     [
...         RgbToGrayscale(),
...         OtsuThreshold(),
...         BinaryDilation(),
...         RemoveSmallHoles(),
...         RemoveSmallObjects(),
...     ]
... )
>>> # `image` is assumed to be an RGB image of the slide loaded beforehand
>>> filters(image)
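The last steps of the sequence (remove small objects, then take the box around the biggest tissue area) can be sketched with a plain connected-components search. This is an illustration under stated assumptions, not the library's code: 4-connectivity, nested-list masks and the function names are all choices made for the example.

```python
from collections import deque


def connected_components(mask):
    """Group True pixels into 4-connected components, each a list of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            queue, component = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(component)
    return components


def remove_small_objects(mask, min_size):
    """Keep only components with at least `min_size` pixels."""
    cleaned = [[False] * len(mask[0]) for _ in mask]
    for component in connected_components(mask):
        if len(component) >= min_size:
            for y, x in component:
                cleaned[y][x] = True
    return cleaned


def biggest_area_box(mask):
    """Bounding box (x_min, y_min, x_max, y_max) of the largest component, or None."""
    components = connected_components(mask)
    if not components:
        return None
    biggest = max(components, key=len)
    ys = [y for y, _ in biggest]
    xs = [x for _, x in biggest]
    return min(xs), min(ys), max(xs), max(ys)
```

On real slides the same operations run on large arrays, where a vectorised library implementation is the practical choice.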
34. Remove artifacts: pen markers
Figure panels: Original image; 1. Grayscale filter; 2. Otsu threshold; 3. Apply Mask; 4. Green Pen Filter
Pen Filters implementation inspired by
https://github.com/CODAIT/deep-histopath
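A pen filter of this kind boils down to per-channel colour thresholds. The sketch below flags a pixel as green pen when red is low while green and blue are high, then paints it white; the threshold values, the white fill and the function names are illustrative assumptions, not the values used by histolab or deep-histopath.

```python
def is_green_pen(pixel, red_max=150, green_min=160, blue_min=140):
    """Hypothetical thresholds: low red together with high green and blue."""
    r, g, b = pixel
    return r < red_max and g > green_min and b > blue_min


def filter_green_pen(image, fill=(255, 255, 255)):
    """Replace pixels flagged as green pen with `fill`; leave the rest untouched."""
    return [[fill if is_green_pen(px) else px for px in row] for row in image]


# One greenish pen pixel followed by one pinkish tissue pixel.
row = [(100, 200, 180), (200, 100, 150)]
cleaned = filter_green_pen([row])
# cleaned == [[(255, 255, 255), (200, 100, 150)]]
```

In practice several threshold triples are usually combined, since pen ink varies in shade across slides.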
45. Lessons learned and notes for future self
You will spend more time writing tests than code,
but at least you will sleep tight at night
100% coverage doesn’t imply 0% bugs,
but stupid mistakes are easily caught
Code formatting and linting is nice,
but automate it to focus on the important stuff
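A tiny, made-up example of the coverage lesson: the helper below is meant to count the tiles needed to cover a row of pixels. A single test with a width that divides evenly executes every line, so coverage reads 100%, yet the function is wrong for any other width. Both functions are hypothetical and not part of histolab.

```python
def tiles_per_row(width_px, tile_px):
    """Intended: tiles needed to cover the row. Bug: floor division drops the last partial tile."""
    return width_px // tile_px


def tiles_per_row_fixed(width_px, tile_px):
    """Ceiling division, so a partial tile at the edge is counted."""
    return -(-width_px // tile_px)


# This check alone gives 100% line coverage of tiles_per_row and passes...
assert tiles_per_row(1024, 512) == 2
# ...while the bug only shows up for widths that are not a multiple of the tile size.
assert tiles_per_row(1000, 512) == 1        # wrong: 2 tiles are needed
assert tiles_per_row_fixed(1000, 512) == 2  # correct
```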
46. Thank you
Any questions?
Histolab is a joint work with Nicole Bussola, PhD student @ FBK-MPBA / CIBIO
Alessia Marcolini, alessia.marcolini@hk3lab.ai, @viperale
Ernesto Arbitrio, ernesto.arbitrio@gmail.com, @__pamaron__