This document summarizes several projects related to microscopy analysis of blood samples:
1. The Slide Staining Project aims to determine the optimal staining procedure and storage conditions for stained blood slides to maximize staining intensity over time.
2. The Color Coding Project develops a MATLAB program to automatically color code microsatellite data from different samples and loci to more easily identify significant differences.
3. The Automated Parasite Density Calculation project creates an ImageJ plugin to automatically count parasites and white blood cells in microscope images and calculate parasite density, reducing manual counting tasks.
4. The Parasitemia Counting Analysis discusses methods for using reticles to estimate total red blood cell counts from a smaller counted area.
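The parasite density calculation that the ImageJ plugin automates is typically a simple ratio against the white blood cell count. As an illustration (the plugin's exact formula is not given here), the widely used WHO thick-film convention can be sketched as:

```python
def parasite_density(parasites_counted, wbc_counted, wbc_per_ul=8000):
    """Estimate parasites per microlitre of blood from a thick-film count.

    Uses the common WHO convention of assuming 8,000 WBCs per microlitre
    when the patient's true WBC count is unknown.
    """
    if wbc_counted <= 0:
        raise ValueError("WBC count must be positive")
    return parasites_counted / wbc_counted * wbc_per_ul


# e.g. 200 parasites counted against 500 WBCs -> 3,200 parasites/uL
density = parasite_density(200, 500)
```

Counting stops once a fixed number of WBCs (often 200 or 500) has been tallied, which is exactly the manual task the plugin aims to remove.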
This document discusses fecal parasite screening tests commonly performed in veterinary medicine. It describes the various types of tests including gross examination, direct smear, and concentration techniques like fecal flotation and sedimentation. Concentration techniques allow a large sample to be examined efficiently by using flotation solutions whose specific gravity is higher than that of parasite eggs and oocysts, causing the eggs to float to the top. The document outlines the steps to perform a centrifugal flotation, which is the most efficient test and involves making a fecal solution, centrifuging the sample, and examining the coverslip under a microscope.
This document describes procedures for examining fecal samples to detect parasite eggs under a microscope. Several methods are discussed, including direct smear, McMaster technique, simple flotation, and sedimentation. Using these methods on samples from sheep, cattle, and dogs, several parasite eggs were observed, including Ancylostoma caninum, Toxocara canis, Fasciola eggs, Trichuris eggs, and Dipylidium caninum eggs enclosed in a capsule. The document concludes that multiple examination methods are needed to thoroughly detect parasites at different infection levels in fecal samples.
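The McMaster technique mentioned above yields a quantitative eggs-per-gram (EPG) figure. A hedged sketch of the arithmetic, using one common protocol's numbers as illustrative defaults (real protocols vary in sample mass, flotation volume, and chamber size):

```python
def mcmaster_epg(eggs_counted, sample_grams=4, flotation_ml=60,
                 chambers=2, chamber_ml=0.15):
    """Eggs per gram of faeces from a McMaster chamber count.

    Defaults assume a 4 g sample in 60 ml of flotation fluid with two
    0.15 ml counting chambers, from which the familiar multiplication
    factor of ~50 falls out. These defaults are illustrative assumptions.
    """
    volume_examined_ml = chambers * chamber_ml
    grams_examined = sample_grams * volume_examined_ml / flotation_ml
    return eggs_counted / grams_examined


# 10 eggs counted across both chambers -> about 500 EPG with these defaults
epg = mcmaster_epg(10)
```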
This document summarizes various laboratory techniques for diagnosing parasitic infections through direct examination of samples like urine, stool, sputum, biopsy specimens, and aspirates, as well as indirect immunological methods and molecular biological techniques. Direct examination involves microscopic evaluation of samples for parasite eggs, larvae, cysts, trophozoites, or adult parasites, while concentration techniques help find parasites in low-density infections. Indirect methods detect antibodies to parasites, and molecular techniques like PCR can identify parasitic DNA in samples.
Saliency Based Hookworm and Infection Detection for Wireless Capsule Endoscop... (IRJET Journal)
This document presents a method for detecting hookworm infection and ulcers in wireless capsule endoscopy images using saliency-based segmentation. The proposed method uses multi-level superpixel segmentation followed by feature extraction of color and texture properties. A particle swarm optimization algorithm is then used to classify images as healthy or infected/ulcerous based on the extracted features. Experimental results on capsule endoscopy images demonstrate the effectiveness of the proposed method at automatically detecting abnormalities in an efficient and non-invasive manner.
1) The document presents a method for near real-time early detection of esophageal cancer from endoscopic images using graphics processing and techniques like discrete wavelet transform and fractal dimension decomposition.
2) The method detects suspected cancerous regions in under 1 second per image, well under the previous goal of 10 seconds and far faster than the earlier best time of 3 minutes.
3) Future work is proposed to improve the high false positive rates using support vector machines and analyzing thresholds more intelligently with GPU processing.
IRJET - Automatic RBC And WBC Counting using Watershed Segmentation Algorithm (IRJET Journal)
This document presents a method for automatically counting red blood cells (RBCs) and white blood cells (WBCs) using image processing techniques. It discusses the limitations of conventional manual counting methods and proposes a software-based watershed segmentation algorithm to segment and count blood cells from microscope images. The algorithm involves preprocessing the image, applying filters, segmenting cells using markers and boundaries, and counting the segmented cells. Experimental results found the automatic method took 14.43 seconds on average and achieved 94.58% accuracy, faster and more accurate than manual counting. This software-based solution provides a low-cost alternative for blood cell analysis in medical laboratories.
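The segment-and-count step above can be sketched in a few lines. Note this is a deliberate simplification: it uses connected-component labelling plus a minimum-area filter, whereas the paper's full pipeline additionally uses watershed markers to split touching cells.

```python
import numpy as np
from scipy import ndimage

def count_cells(binary_mask, min_area=4):
    """Count segmented cells in a binary mask.

    A simplified stand-in for the watershed step: label connected
    components, then drop components smaller than `min_area` pixels
    (noise specks). `min_area` is an illustrative threshold.
    """
    labeled, n_components = ndimage.label(binary_mask)
    if n_components == 0:
        return 0
    areas = ndimage.sum(binary_mask, labeled, range(1, n_components + 1))
    return int(np.sum(areas >= min_area))
```

In the real algorithm the binary mask would come from the preprocessing and filtering stages described in the paper.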
This document discusses computer aided detection (CAD) of abnormalities in medical images. It begins by outlining CAD and some of the key machine learning challenges, including correlated training data, non-standard evaluation metrics, runtime constraints, lack of objective ground truths, and data shortages. It then describes solutions like multiple instance learning, batch classification, cascaded classifiers, crowdsourcing algorithms, and multi-task learning. The document concludes by reviewing the clinical impact of CAD systems through several independent studies, which demonstrated improved radiologist performance and sensitivity in detecting diseases.
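Among the solutions listed above, cascaded classifiers address the runtime constraint: cheap stages reject the many easy negatives so expensive stages only see survivors. A minimal sketch with a hypothetical predicate-list interface (not any specific CAD system's API):

```python
def cascade_classify(sample, stages):
    """Run a classifier cascade over one sample.

    `stages` is an ordered list of (predicate, name) pairs, cheapest
    first. Any stage may reject early; only samples surviving every
    stage are flagged positive.
    """
    for predicate, name in stages:
        if not predicate(sample):
            return False, name  # rejected at this (cheap) stage
    return True, "all-stages-passed"


# toy stages standing in for real image-feature classifiers
stages = [(lambda x: x > 0, "nonzero"), (lambda x: x % 2 == 0, "even")]
```

The design win is that average cost stays near the cheapest stage's cost, because most samples never reach the later stages.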
This document summarizes an application for automating quality assurance (QA) testing of positron emission tomography (PET) and single-photon emission computed tomography (SPECT) cameras. The application defines a template using a phantom mask, finds the best slice of a PET image to apply the mask to, fits the mask to the image slice, and calculates QA values and generates a report. It was developed for two GE camera models and uses image processing techniques like Hough transforms and contrast adjustment to select the best slice and identify regions of interest in the phantom image for analysis. The goals were to reduce time spent on manual QA testing and ensure results are consistent. The document discusses the GUI, classes, algorithms, and testing of the application.
The document discusses three main problems with de novo assembly of next generation sequencing data and proposes solutions. The three problems are 1) large memory and compute requirements for assembly, 2) complexity of the assembly process and lack of standardized protocols, and 3) limited training opportunities that are difficult for students. The proposed solutions are standardized assembly protocols called khmer-protocols that provide copy-paste workflows for mRNAseq and metagenome assembly using techniques like digital normalization to reduce memory usage and make assembly scalable. The khmer-protocols are designed to be open, versioned, and reproducible to generate initial assembly results cheaply and easily in the cloud.
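The digital normalization step mentioned above can be sketched as follows. This is a toy version of the idea behind khmer, using an in-memory Counter where khmer uses probabilistic counting structures; the k size and cutoff are illustrative defaults.

```python
from collections import Counter

def digital_normalization(reads, k=4, cutoff=3):
    """Keep a read only while its median k-mer abundance is below `cutoff`.

    Once a region of the genome is already well covered by kept reads,
    additional reads from it are discarded, shrinking the data set (and
    assembler memory use) without losing novel sequence.
    """
    counts = Counter()
    kept = []
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        if not kmers:
            continue
        abundances = sorted(counts[km] for km in kmers)
        if abundances[len(abundances) // 2] < cutoff:
            kept.append(read)
            counts.update(kmers)
    return kept
```

Redundant copies of the same read are dropped after the first few, while a read containing unseen k-mers is always retained.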
PREDICTION BASED LOSSLESS COMPRESSION SCHEME FOR BAYER COLOUR FILTER ARRAY IM... (ijiert bestjournal)
This paper presents an experimental evaluation of the effectiveness of various techniques for lossless compression of CFA images. A colour image requires at least three colour samples at each pixel location, so a digital camera would need three separate sensors to measure the image completely. In a three-chip colour camera, the light entering the camera is split and projected onto each spectral sensor. Each sensor requires its own driving electronics, and the sensors have to be registered precisely; these additional requirements add a large expense to the system. Thus most commercial digital cameras use colour filter arrays to sample red, green, and blue according to a specific pattern. At each pixel location only one colour sample is taken, and the values of the other colours must be interpolated using neighbouring samples. This colour plane interpolation is known as demosaicing, and it is generally carried out before compression. Recently, it was found that compression-first schemes outperform the conventional demosaicing-first schemes in terms of output image quality. An efficient prediction-based lossless compression scheme for Bayer CFA images is proposed in this paper. It exploits a context matching technique to rank the neighbouring pixels when predicting a pixel, an adaptive colour difference estimation scheme to remove the colour spectral redundancy when handling red and blue samples, and an adaptive codeword generation technique. Simulation results compare the different coding schemes in terms of compression ratio.
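The core prediction idea can be shown in miniature. This sketch uses plain left-neighbour prediction, which is a much-simplified stand-in for the paper's scheme (context matching to rank neighbours, adaptive colour differences, adaptive codewords); the point is only that prediction residuals are small and invertible, hence losslessly compressible.

```python
import numpy as np

def encode_row(row):
    """Left-neighbour prediction: store the first sample raw, then residuals."""
    residuals = np.empty_like(row)
    residuals[0] = row[0]
    residuals[1:] = row[1:] - row[:-1]
    return residuals

def decode_row(residuals):
    """Invert the prediction: a cumulative sum reconstructs the row exactly."""
    return np.cumsum(residuals)


row = np.array([10, 12, 11, 11, 15], dtype=np.int64)
residuals = encode_row(row)  # small values, cheap to entropy-code
```

For real CFA data the predictor would only use same-colour neighbours from the Bayer pattern, which is where the colour difference estimation comes in.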
This document discusses using computer vision and cameras for measurement applications. It begins by introducing the speaker and their background. It then discusses some of the challenges with computer vision accuracy, particularly when using cameras as contactless sensors outdoors. It provides examples of using video analytics to extract metadata like people counts and speed measurements. The document emphasizes that measurement accuracy depends on many factors like sensor calibration, installation, and environmental conditions.
IRJET - Crowd Density Estimation using Image Processing (IRJET Journal)
This document describes a research project that uses image processing techniques to estimate crowd density. Specifically, it uses skin color detection and morphological operations to identify and count the number of people in an image. It begins with an abstract that introduces the topic and objectives. It then provides background information on relevant color models and traditional crowd density estimation approaches. The proposed system is described as using skin color detection in the HSV color space to identify skin pixels, followed by morphological operations to find and count human faces, in order to efficiently and accurately estimate crowd density in images.
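The HSV skin-detection step above can be sketched per pixel. The threshold values here are illustrative assumptions, not the paper's: typical skin hues cluster in a narrow band around red/orange, with moderate saturation and sufficient brightness.

```python
import colorsys

def is_skin_pixel(r, g, b):
    """Classify one RGB pixel (components 0-255) as skin-coloured via HSV.

    Hue band, saturation range, and value floor below are assumed
    example thresholds; a real system would tune them on labelled data.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hue_deg = h * 360.0
    in_hue_band = hue_deg <= 50.0 or hue_deg >= 340.0
    return in_hue_band and 0.15 <= s <= 0.9 and v >= 0.35
```

The resulting binary skin mask is what the morphological operations then clean up and group into face-sized blobs for counting.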
This document discusses a project to detect cyanosis, a medical condition where skin turns blue, using color detection in MATLAB. The project aims to develop a low-cost medical solution to remotely diagnose cyanosis. It works by taking an input image, converting it from RGB to grayscale, filtering out red and green colors to detect blue, and outputting whether cyanosis is detected. The document provides background on cyanosis, describes the proposed system workflow, and discusses results and conclusions that this technique can efficiently detect cyanosis and save time for doctors and patients.
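The blue-detection idea can be illustrated with a simple channel comparison. This is a sketch of the concept rather than the report's exact RGB-to-grayscale filtering workflow, and the `margin` threshold is an assumed value.

```python
import numpy as np

def cyanosis_fraction(rgb_image, margin=30):
    """Fraction of pixels whose blue channel dominates both red and green.

    `rgb_image` is an (H, W, 3) uint8 array; `margin` is how far blue
    must exceed the other channels to count as bluish (assumed value).
    """
    r = rgb_image[..., 0].astype(int)
    g = rgb_image[..., 1].astype(int)
    b = rgb_image[..., 2].astype(int)
    bluish = (b > r + margin) & (b > g + margin)
    return float(bluish.mean())
```

A high fraction over a skin region would then be flagged as possible cyanosis for a clinician to review.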
This document discusses representative sampling and quality assurance/quality control procedures. It covers topics such as types of samples, ideal sampling locations, data quality objectives, and examples of proper and improper sampling techniques. Quality control measures like blanks, duplicates, and standards are described to ensure sample accuracy, precision, and to check for contamination in the sampling and analysis process. Maintaining proper sample handling and preservation techniques as well as adhering to hold times for analysis are also important aspects of quality control.
The document provides instructions for creating runs, defining protocols and graphs, viewing results, and performing background subtraction and quantification on the Smart Cycler system. It also discusses user administration, analysis settings, export options, melt analysis, and troubleshooting.
A DEEP LEARNING APPROACH FOR SEMANTIC SEGMENTATION IN BRAIN TUMOR IMAGES (PNandaSai)
Digital image processing is a vast field with many applications, including criminal face detection, fingerprint authentication, medical imaging, and object recognition. Brain tumor detection plays an important role in the medical field: it locates the tumor-affected part of the brain along with its shape, size, and boundary.
Segmentation and the subsequent quantitative assessment of lesions in medical images provide valuable information for neuropathological analysis and are important for planning treatment strategies, monitoring disease progression, and predicting patient outcome. For a better understanding of the pathophysiology of diseases, quantitative imaging can reveal clues about disease characteristics and their effects on particular anatomical structures.
In this project, we propose a novel automatic detection of diabetic retinopathy in which a deep neural network is used to classify images that indicate the disease. The main aim of this project is to find a suitable way to detect and classify these problems. The proposed radial basis function neural network (RBFNN) classifier gives high precision in grading the disease through spatial examination. The RBFNN classifier does not require a long training time, so model production can be expedited. From our data set of 80,000 images, the proposed RBFNN achieves a sensitivity of 95% and an accuracy of 75% on 5,000 validation images. Fuzzy c-means clustering is used to store the processed image information. Finally, the proposed system is developed in MATLAB simulation.
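An RBFNN forward pass is compact enough to sketch directly: hidden units are Gaussians over the distance to learned centers, and the output is their weighted sum. Shapes and values below are illustrative, not the project's trained model.

```python
import numpy as np

def rbf_forward(x, centers, widths, weights):
    """Forward pass of a radial basis function network (RBFNN).

    x: (d,) input vector; centers: (m, d) hidden-unit centers;
    widths: (m,) Gaussian widths; weights: (m,) output weights.
    """
    sq_dist = ((x - centers) ** 2).sum(axis=1)        # distance to each center
    phi = np.exp(-sq_dist / (2.0 * widths ** 2))       # Gaussian activations
    return float(phi @ weights)                        # weighted sum output
```

Because only the output weights need full training (centers can be set by clustering), training is fast, which matches the short-training-time claim above.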
IRJET - Detection of Skin Cancer using Convolutional Neural Network (IRJET Journal)
This document presents a method for detecting skin cancer using convolutional neural networks. The proposed method involves collecting skin images, preprocessing them by removing noise and segmenting regions of interest, extracting features like asymmetry, border, color, and diameter, performing dimensionality reduction using principal component analysis, calculating dermoscopy scores, and classifying images as malignant or benign using a convolutional neural network (CNN) model. The CNN model achieves 92.5% accuracy in classification. The document provides background on skin cancer and challenges with traditional biopsy methods. It describes the system architecture including data collection, preprocessing, segmentation, feature extraction, and classification steps. Key aspects of CNNs such as convolutional, ReLU, pooling, and fully connected layers are also reviewed.
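The dermoscopy score mentioned above is usually the Total Dermoscopy Score from the classic ABCD rule. A hedged sketch using the commonly published weights and cut-offs (the paper's own scoring may differ):

```python
def dermoscopy_score(asymmetry, border, colors, structures):
    """Total Dermoscopy Score (TDS) from the classic ABCD rule.

    Inputs: asymmetry 0-2, border 0-8, colors 1-6, structures 1-5.
    Weights and cut-offs follow the commonly published rule; they are
    quoted here as assumptions, not taken from the paper.
    """
    tds = 1.3 * asymmetry + 0.1 * border + 0.5 * colors + 0.5 * structures
    if tds > 5.45:
        verdict = "suspicious for melanoma"
    elif tds >= 4.75:
        verdict = "borderline"
    else:
        verdict = "likely benign"
    return tds, verdict
```

In the pipeline above, the A/B/C/D inputs would come from the feature-extraction stage, and the TDS feeds the final classification alongside the CNN.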
Hardware realization of Stereo camera and associated embedded system (IJERA Editor)
A stereo camera has two lenses about the same distance apart as human eyes, with a separate image sensor for each lens. This allows the camera to simulate human binocular vision, and therefore gives it the ability to capture three-dimensional images. It detects depth information of the subject, which allows the user to capture images that are instantly rendered in 3D. Stereo cameras are also required in stereo vision, a ranging method which finds application in almost every field. Still, stereo 3D hasn't yet become a standard because of technical problems, including ergonomic issues, cost, and a lack of hardware and software standards. For these reasons, it is important to achieve low-cost, standard hardware for 3D vision, for which a novel stereo camera architecture is required. This paper proposes a low-cost solution for stereo cameras, as cameras can be designed as per requirement, and mainly focuses on the processing of raw sensor image data.
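The ranging method mentioned above rests on one classic relation: depth is focal length times baseline divided by disparity, Z = f·B/d. A minimal sketch (the numeric values in the example are made up for illustration):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Classic stereo ranging: Z = f * B / d.

    focal_px: focal length in pixels; baseline_m: distance between the
    two lenses in metres; disparity_px: horizontal shift of a feature
    between the left and right images, in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (zero means infinitely far)")
    return focal_px * baseline_m / disparity_px


# e.g. 700 px focal length, 6 cm baseline, 7 px disparity -> ~6 m away
z = depth_from_disparity(700, 0.06, 7)
```

This relation is also why the baseline matters: a wider lens separation gives larger disparities and therefore better depth resolution at range.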
This document presents a method for automated detection of dengue virus infection using images of white blood cells. The method involves preprocessing images using median filtering to remove noise, segmentation of white blood cells using morphological thresholding, feature extraction of cell nuclei using SIFT, and classification of cells as infected or non-infected using support vector machines (SVM). Testing showed the method achieved higher accuracy than existing techniques, with 75% accuracy compared to 65% for existing methods. The authors also explored using a two-layer feedforward neural network for classification, which achieved even higher accuracy than SVM.
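The median-filtering preprocessing step above is straightforward to sketch. This is a generic 3x3 median filter on a list-of-lists image, with border windows clamped to the image (a simple edge policy chosen here, not necessarily the paper's):

```python
from statistics import median

def median_filter(image, radius=1):
    """Median-filter a 2D image (list of lists of pixel values).

    Each output pixel is the median of its (2*radius+1)^2 neighbourhood;
    windows at the border are clamped to the image, so edge pixels use
    a smaller window.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = median(window)
    return out
```

Median filtering is preferred here over mean filtering because it removes salt-and-pepper noise without blurring cell boundaries, which matters for the later thresholding step.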
This document provides an overview of a project to develop a face detection system integrated with a time management system. It discusses the background of time management systems, acknowledges those who provided assistance, and outlines the report structure. The problem analysis section describes the challenges of face recognition. It also covers system requirements, resources, and methodology. Research on programming tools like Java, C++, and MATLAB is presented to justify the tools selected for implementation.
This document discusses AncestryDNA's use of Hadoop to scale their DNA analysis pipeline as their database and processing needs grew rapidly over time. It describes how they initially ran the entire pipeline on a single machine, and then incrementally moved each step of the pipeline to run on Hadoop clusters, including running Admixture ethnicity processing with MapReduce, replacing GERMLINE matching with a new Jermline algorithm implemented in MapReduce, and moving phasing from Beagle to a new Underdog implementation in MapReduce. Each change significantly improved performance and allowed them to keep up with the growth of their DNA database and user base.
In this Spark session Ravi Saraogi talks about why estimating default risk in fund structures can be a challenging task. He presents how this process has evolved over the years and the current methodologies for assessing such risks.
Fruitbreedomics workshop WP6 DNA extraction methods (fruitbreedomics)
The document summarizes methods for DNA extraction that were tested for use in marker-assisted breeding of fruit trees. Four extraction methods were evaluated: 1) "quick and dirty" commercial kits, 2) "direct PCR" kits, 3) magnetic particle-based kits, and 4) a homemade CTAB method. The homemade CTAB method was found to provide high quality DNA at the lowest cost and was well-suited for marker-assisted breeding work requiring analysis of hundreds of samples. The document also provides details on optimization of the KAPA 3G Plant PCR kit for short DNA fragments and highlights CTAB and KAPA 3G PCR as good extraction methods.
In this Spark session Ravi Saraogi talks about why estimating default risk in fund structures can be a challenging task. He presents on how this process has evolved over the years and the current methodologies for assessing such risks.
Fruitbreedomics workshop wp6 dna extraction methodsfruitbreedomics
The document summarizes methods for DNA extraction that were tested for use in marker-assisted breeding of fruit trees. Four extraction methods were evaluated: 1) "quick and dirty" commercial kits, 2) "direct PCR" kits, 3) magnetic particle-based kits, and 4) a homemade CTAB method. The homemade CTAB method was found to provide high quality DNA at the lowest cost and was well-suited for marker-assisted breeding work requiring analysis of hundreds of samples. The document also provides details on optimization of the KAPA 3G Plant PCR kit for short DNA fragments and highlights CTAB and KAPA 3G PCR as good extraction methods.
4. Slide Staining Project (SSP)
Why is this project important?
Currently, there is no rigid protocol that can be followed to get
the best-stained slides (the primary focus being to make the parasite visible)
There are many stains available on the market, but we do not
know which one to prefer
There is very large variation in staining intensity across the slides
stained here at the GMC lab
Stained slides have been observed to lose their stain over time
This project aims at finding the best way of staining thin
blood smears, as well as the best storage conditions, in order to
retain the staining for a long time
5. Slide Staining Project
Staining procedure was divided in the following parts and analysed briefly:
Making Smear – Drying 1 – Fixing – Drying 2 – Staining – Washing –
Drying 3 – Storage
1. Making Smear – Experienced Person Required
2. Drying 1 – Air drying! Is the drying time important, or do we just dry until all the moisture is lost?
3. Fixing – Methanol. How long should fixing take?
4. Drying 2 – Same issues as Drying 1. What if we keep it for weeks / years?
5. Staining – Method of staining (Horizontal surface Vs. Coplin Jars) / Amount of
Stain / Use of pipette or dropper to drop the stain / Diluting preferred or not?
6. Washing – pH of the solution / Buffer Vs. Distilled Vs. Tap water?
7. Drying 3 – Dry enough to remove moisture
8. Storage – Methods and Conditions of storage
Red – Problems | Black – No Problem
6. SSP: Methodology
PART A : Conducting the Experiments
I tried to optimize each of the above steps in the staining
procedure by conducting small experiments for each step
The parameters varied were time, method of washing,
method of staining, etc.
While experimenting on a particular staining step, all the
other staining steps were kept constant
The resulting slides were then compared for staining intensity
7. SSP: Methodology
PART B : Quantification of Images
5 images were taken from each slide as a representation of
the slide
ImageJ software was used to quantify the images in terms of
the red-channel mean grayscale values of the cells and the glass
Basically: select the cells in the image and measure their
intensity; then select the portion excluding the cells (the glass)
and measure its intensity
The lower the intensity, the darker the cells, i.e. the stronger the
effect of the staining
The higher the difference between the glass and cell intensities, the
easier it is to distinguish the cells from the glass
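The quantification above can be sketched in Python. The actual work used ImageJ; `staining_contrast`, its toy image, and the chosen pixel values are purely illustrative, not the lab's procedure:

```python
import numpy as np

def staining_contrast(red_channel, cell_mask):
    """Mean red-channel grayscale of cells vs. glass, and their difference.

    red_channel: 2D array of red-channel pixel values (0-255).
    cell_mask:   boolean 2D array, True where a cell was selected.
    A lower cell mean means darker cells (stronger staining); a larger
    glass-cell difference means the cells are easier to distinguish.
    """
    cell_mean = red_channel[cell_mask].mean()
    glass_mean = red_channel[~cell_mask].mean()
    return cell_mean, glass_mean, glass_mean - cell_mean

# Toy example: dark cells (value 80) on bright glass (value 220)
img = np.full((100, 100), 220.0)
mask = np.zeros((100, 100), dtype=bool)
mask[40:60, 40:60] = True
img[mask] = 80.0
cells, glass, diff = staining_contrast(img, mask)  # 80.0, 220.0, 140.0
```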
8. SSP: Methodology
PART C : Storage Conditions
Following are the parameters which were considered for
storage conditions:
Temperature of Storage: Room Temp Vs. 4 °C
No Covering Vs. Coverslips Vs. Oil Immersion Storage
Along with the above conditions, some of the slides were
re-stained to see if it helps
Stained slides are stored currently in the respective conditions
and data for their current staining intensities is recorded
These slides should be taken out after a year and quantified
again for the staining intensity using the new images
In this way we can find out the % deterioration in the staining
intensity associated with each storage condition
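Assuming the contrast metric from Part B (glass mean minus cell mean), the % deterioration could be computed as below. The exact metric is not fixed in the slides, so this function and its example numbers are a hypothetical sketch:

```python
def percent_deterioration(contrast_initial, contrast_after):
    """Percent loss in staining contrast (glass mean - cell mean)
    between the initial quantification and the re-check a year later."""
    return (contrast_initial - contrast_after) / contrast_initial * 100.0

# e.g. contrast drops from 140 to 105 grayscale units after a year
loss = percent_deterioration(140.0, 105.0)  # 25.0 % deterioration
```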
9. SSP: Summarized Results
Stain           | Giemsa        | Hemacolor       | Giemsa Improved
Drying 1 Time   | 0-5 minutes   | 0-5 minutes     | 0-5 minutes
Fixing Time     | 1-5 seconds   | 5 seconds       | 1-5 seconds
Drying 2 Time   | 30-60 minutes (all stains)
Staining Method | Horizontal    | Coplin          | Coplin
Staining Time   | 20 minutes    | 3 sec / 15 sec  | 20 minutes
Washing Method  | Buffer solution / distilled water in a squeeze bottle (all stains)
Drying 3 Time   | Until dry (all stains)
10. SSP: Key Points
Quality-wise: Giemsa Improved > Hemacolor > Giemsa
Drying 1 time can be reduced to a couple of minutes
Skipping fixing works only for the Giemsa stain
Do not dry slides after fixing near the sink
Drying 2 time is very important in terms of stain absorption
Using a plastic rack as a horizontal surface is not recommended
The washing method should be changed to either distilled water or buffer
solution in a squeeze bottle
The fixing solution often gets contaminated with marker ink
Slides should not be stacked until completely dried
Giemsa Improved stain is easily lost if wiped harshly
12. Color Coding Project (CCP)
Why is this project important?
Microsatellite experiments produce a lot of data. It is very
difficult to make sense of that data just by looking at the
raw numbers
A better way is to convert the data into a color-coded image
that uses different color scales to point out significant
differences in the data
Being automated, it reduces the manual work tremendously
and saves time and effort
13. CCP: Methodology
MATLAB (Matrix Laboratory) is a numerical computing
environment and fourth-generation programming language
I have designed a program with functions such as:
- Specifying the sheet number from the Excel file to color code
- Separating one patient sample from another
- Color coding the data from specified experimenters only
- Assigning different color scales to different loci
- Specifying the sensitivity
The Excel files were first formatted into a specific form and were
then used as input to the code
14. CCP: Methodology (cont.)
The program basically:
Imports the data from the Excel file
Formats the data in the MATLAB workspace
Measures the upper and lower limits of the data per locus
Scales the data into divisions proportional to the range of the data
Assigns different color scales to different loci
Assigns one color to each division
Creates a figure with the sample IDs and locus names as the axes
Displays the image
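The per-locus scaling and binning steps above can be sketched as follows. This is an illustrative Python version (the actual program is in MATLAB), and `color_code` and its division count are assumptions:

```python
import numpy as np

def color_code(data, n_divisions=5):
    """Bin each locus (column) of a samples-x-loci matrix into equal-width
    divisions between that locus's own minimum and maximum, mirroring the
    per-locus limits-and-scaling steps. Returns integer bin indices
    (0 .. n_divisions-1) that a plotting routine can map to colors."""
    data = np.asarray(data, dtype=float)
    lo = data.min(axis=0)                    # lower limit per locus
    hi = data.max(axis=0)                    # upper limit per locus
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against a constant locus
    bins = ((data - lo) / span * n_divisions).astype(int)
    return np.clip(bins, 0, n_divisions - 1)

# 3 samples x 2 loci; each column is scaled on its own range
codes = color_code([[100, 1.0],
                    [110, 1.5],
                    [120, 2.0]])
# codes is [[0, 0], [2, 2], [4, 4]]
```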
17. Automated Parasite Density Calculation
Why is this project important?
Counting parasite density for high parasite densities is quite
tedious and very tiring when a large number of slides are to
be examined
This method uses multiple images per slide as input to the
software, counts the number of parasites and WBCs in
each image, and saves the information
This data can then easily be used to calculate the parasite
density
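For reference, parasite density is commonly derived from the two counts against an assumed WBC concentration. The x8000 WBCs/µL convention shown here is the standard WHO thick-film approach; the slides do not state which constant this lab uses, so treat it as an assumption:

```python
def parasite_density_per_ul(parasites, wbcs, wbc_per_ul=8000):
    """Parasites per microlitre of blood, estimated from the number of
    parasites counted against a number of WBCs, assuming a fixed WBC
    concentration (8000/uL is the usual WHO convention)."""
    return parasites / wbcs * wbc_per_ul

# e.g. 350 parasites counted against 200 WBCs
density = parasite_density_per_ul(parasites=350, wbcs=200)  # 14000.0 /uL
```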
18. Methodology
ImageJ is a public-domain, Java-based image processing
program developed at the National Institutes of Health (NIH)
User-written plugins (here, programming code) make it very
easy to solve image-processing tasks
The code I have compiled sets intensity thresholds as well as size
thresholds to count the number of parasites and WBCs
in a particular image
19. Methodology
Images from the yellow-light microscope at light intensity 2 seem
to work better for the program
Images are taken at 2x digital zoom using the digital camera
(a dedicated microscope camera would help here)
and are then cropped into rectangles
There are multiple ways of selecting the parasites, which can
be found just by playing around with the software
The one I am using now splits the image into its
RGB (i.e. Red, Green and Blue) channels and works on the G
channel for parasites and the R channel for WBCs
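The intensity-plus-size thresholding idea can be sketched as follows. This is a Python illustration only (the real code is an ImageJ plugin), and every threshold value here is made up:

```python
import numpy as np

def count_objects(channel, max_intensity, min_size, max_size):
    """Count stained objects in one color channel: keep pixels darker
    than the intensity threshold, group them into 4-connected
    components, and keep only components whose pixel area falls
    inside the size window (the size threshold)."""
    mask = channel < max_intensity
    seen = np.zeros_like(mask, dtype=bool)
    h, w = mask.shape
    count = 0
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                # Flood-fill one connected component and measure its size
                stack, size = [(i, j)], 0
                seen[i, j] = True
                while stack:
                    y, x = stack.pop()
                    size += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            stack.append((ny, nx))
                if min_size <= size <= max_size:
                    count += 1
    return count

# Toy green channel: two small dark blobs (parasite-sized) and one
# large dark blob that the size threshold rejects.
g = np.full((50, 50), 200.0)
g[5:8, 5:8] = 40.0       # 9-pixel blob   -> counted
g[20:23, 20:23] = 40.0   # 9-pixel blob   -> counted
g[30:45, 30:45] = 40.0   # 225-pixel blob -> too big, rejected
parasites = count_objects(g, max_intensity=100, min_size=4, max_size=50)  # 2
```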
22. Results
I performed a manual count on the first 2 images, which gave:
1st Image: 121 Parasites
2nd Image: 135 Parasites
While the program counted:
1st Image: 110 Parasites
2nd Image: 145 Parasites
The size limits should be optimised by running the program on
multiple images and slides
This program will work well on high parasite density slides
24. Parasitemia Counting Analysis
What is Percentage Parasitemia?
Percentage parasitemia is the percentage of infected RBCs out of
the total RBCs
It essentially means that if 10 out of 100 RBCs are infected, the
parasitemia is 10%
Why are reticles used in counting Parasitemia?
Counting all the RBCs is tedious and takes a lot of time
Reticles allow us to count the RBCs in a smaller area and
then scale the count up to a value close to what the real count
would have been
25. Parasitemia Counting Analysis
Miller Reticles
The Miller reticle provides 2 squares in
which the area of the smaller square is a
known fraction of the area of the bigger square
For example, in the pictures on the right, the
top one is 1:5 and the bottom one is 1:9
It essentially means that if the area of the
smaller square in the top picture is 10 units,
the area of the bigger square is 50 units
And in the bottom image, if the area of the smaller
square is 10, then that of the bigger one is 90
26. Parasitemia Counting Analysis
How to use these Reticles?
RBCs occupy area on the slide
Hence, we can estimate the number of RBCs in the bigger
square just by counting the RBCs in the smaller square and
then multiplying by the area factor
Consider this example of 1:9 reticle where
the RBCs are uniformly distributed
RBCs in the smaller square: 4
RBCs in the larger square: 4*9 = 36
27. Parasitemia Counting Analysis
How to obtain a formula? (for ex. Consider 1:5 reticle)
% Parasitemia = (Infected RBCs / Total RBCs) x 100

= (Infected RBCs in the bigger reticle / All RBCs in the bigger reticle) x 100

= (Infected RBCs in the bigger reticle / (All RBCs in the smaller reticle x 5)) x 100

= (Infected RBCs in the bigger reticle / All RBCs in the smaller reticle) x 20

where 5 is the Area Factor of the reticle
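The final formula can be checked with a short sketch (Python; the counts in the example are invented for illustration):

```python
def percent_parasitemia(infected_big, rbc_small, area_factor):
    """% parasitemia from a Miller reticle count: infected RBCs are
    counted in the bigger square, all RBCs in the smaller square, and
    the smaller count is scaled up by the reticle's area factor."""
    return infected_big / (rbc_small * area_factor) * 100.0

# 1:5 reticle: 12 infected RBCs in the big square, 30 RBCs in the small one
p = percent_parasitemia(infected_big=12, rbc_small=30, area_factor=5)  # 8.0 %
# Equivalent to the x20 multiplier form: 12 / 30 * 20 = 8.0
```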
28. Parasitemia Counting Analysis
But the problem does not end here!
What about the cells on the edges?
If you come across a scenario like this, you need to have a
protocol
Possible methods: count the RBCs that are -
1. on 2 of the four edges of the reticle
2. more than 50% inside
3. on all the edges
4. on none of the edges
29. Parasitemia Counting Analysis
Errors because of reticle misinformation!
Multiplying by 25 instead of 20 for the 1:5 reticle creates:
(25 - 20) / 20 x 100 = 25% error (overestimation)
Multiplying by 10 instead of 11.11 for the 1:9 reticle creates:
(11.11 - 10) / 11.11 x 100 = 9.99% error (underestimation)
30. Parasitemia Counting Analysis
Why is a definite protocol necessary worldwide?
We can explain it in terms of error propagation
Let's start with a case where the actual % parasitemia is 10
Error in taking a blood sample as representative of the patient's
complete blood (say 10%): parasitemia becomes 11%
Error in scanning a particular area of the slide as representative of
the complete slide (say 10%): parasitemia becomes 12.1%
Error in the reticle ratio because of manufacturing errors (we had
25%): parasitemia becomes 15.125%
Error in parasitemia counting because of inaccurate methods
(in an extreme case, say 15%): parasitemia becomes 17.39%
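The compounding in this example is multiplicative, which a few lines make explicit (an illustrative Python sketch of the same arithmetic):

```python
def propagate(true_value, relative_errors):
    """Compound multiplicative relative errors, as in the worst-case
    example where each error source inflates the estimate in turn."""
    estimate = true_value
    for e in relative_errors:
        estimate *= 1.0 + e
    return estimate

# 10% sampling + 10% scanning + 25% reticle + 15% counting error
est = propagate(10.0, [0.10, 0.10, 0.25, 0.15])  # about 17.39
```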
31. Parasitemia Counting Analysis
Why is a definite protocol necessary worldwide?
So in the previous example, the parasitemia which was
actually 10% was estimated as 17.39%
This is over 70% error
The error could also go the other way and
underestimate the count
Also, not having a definite protocol makes the calculation
biased by the individual, creating differences from person to person as
well as from lab to lab
32. Concluding..
The Slide Staining Project can be followed up to draw rigid
conclusions
Some of the results are straightforward and should be
incorporated into the daily staining procedure right away
The storage conditions should be monitored after a year
MATLAB is installed on the genomics room computer
The Parasite Density Automation can be followed up and would be
very useful once the size limits are obtained
The reticle issue should be sorted out
An accurate parasitemia counting procedure should be adopted
as a protocol