Processing malaria HTS results using KNIME: a tutorial – Greg Landrum
Walks through a couple of KNIME workflows for working with HTS data.
The workflows are derived from the work described in this publication: https://f1000research.com/articles/6-1136/v2
An Early Evaluation of Running Spark on Kubernetes – DataWorks Summit
Kubernetes is an open source system to deploy, scale, and manage containerized applications anywhere. It builds on 15 years of running Google's containerized workloads and the valuable contributions from the open source community. To shepherd Kubernetes' evolution with the open source community, Google helped form the Cloud Native Computing Foundation (CNCF) and donated Kubernetes as the founding project. Starting in Spark 2.3.0, Spark has an experimental option to run clusters managed by Kubernetes. This feature makes use of the native Kubernetes scheduler that has been added to Spark. In this talk, we will provide a baseline understanding of what Kubernetes is, why it is relevant for the Spark community and how it compares to YARN. We will then look under the hood of Spark managed by Kubernetes to better understand how this works. Finally, we provide an early evaluation of this feature as well as our thoughts on the future of running Spark on Kubernetes.
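The talk covers the mechanics in detail; as a minimal sketch of what the experimental Kubernetes mode looks like in practice, the snippet below drives spark-submit from Python against a Kubernetes API server in cluster mode (the mode supported in Spark 2.3.0). The API server URL, container image name, and jar path are placeholders, not values from the talk.

```python
# Sketch: submitting the stock SparkPi example to a Kubernetes-managed
# cluster, as supported experimentally since Spark 2.3.0. The native
# Kubernetes scheduler launches the driver and executors as pods built
# from the given container image. All URLs/paths below are placeholders.
import subprocess

subprocess.run([
    "spark-submit",
    "--master", "k8s://https://kubernetes.example.com:6443",  # k8s API server (placeholder)
    "--deploy-mode", "cluster",                               # driver itself runs in a pod
    "--name", "spark-pi",
    "--class", "org.apache.spark.examples.SparkPi",
    "--conf", "spark.executor.instances=4",                   # scheduler creates 4 executor pods
    "--conf", "spark.kubernetes.container.image=example.com/spark:2.3.0",
    "local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar",
], check=True)
```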
Flink for Everyone: Self Service Data Analytics with StreamPipes - Philipp Ze... – Flink Forward
This talk presents StreamPipes (https://www.streampipes.org), an open source self-service data analytics solution leveraging existing big data technologies such as Apache Flink to provide non-technical users with an easy and intuitive way to connect, analyze and exploit a variety of different streaming data sources for their use.
Newly arising IoT-driven use cases in domains such as manufacturing, smart city, or autonomous driving often demand continuous integration and processing of sensor data in order to derive time-sensitive actions. One example is the optimization of maintenance processes based on the current condition of machines (condition-based maintenance). While this is technically already well supported by the existing big data tool landscape, building such applications still requires a crucial set of skills, ranging from general domain expertise and programming to deep knowledge of distributed and scalable systems. Such skills are usually not present in hardware-focused manufacturing companies.
To mitigate these shortcomings, StreamPipes allows non-technical users to leverage a graphical editor to model and deploy analytical tasks as pipelines in a drag-and-drop manner. Pipelines are built from a toolbox of reusable data adapters, processors, and sinks. Toolbox elements encapsulate dedicated algorithms (e.g., filters, aggregations, machine learning classifiers) implemented in big data processing engines such as Apache Flink, communicating over an internal distributed messaging system (e.g., Apache Kafka).
In this talk, we present technologies and tools enabling flexible modeling of real-time processing pipelines by domain experts. We motivate our talk by showing real-world examples we gathered from a number of industry projects during the past years in Industrial IoT domains such as manufacturing and supply chain management. For instance, we show how StreamPipes eases the accessibility of big data tools for non-technical users based on examples such as supervising a fleet of autonomous electric delivery vehicles as well as data analytics in one of the largest test areas for autonomous driving in Germany.
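As a generic illustration of the pipeline-element idea described above (reusable processors communicating over a messaging system such as Kafka), the sketch below shows a standalone "processor" that consumes sensor events from one topic, applies a simple filter algorithm, and forwards matches downstream. This is not StreamPipes code; the broker address, topic names, and threshold are hypothetical.

```python
# Hypothetical filter processor in the style of a pipeline element: consume
# machine sensor readings from Kafka, keep only readings over a critical
# threshold (e.g. for condition-based maintenance), and emit them to a
# downstream topic. Uses kafka-python; topics and threshold are made up.
import json
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    "machine-temperature",                                     # hypothetical input topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

for event in consumer:
    reading = event.value
    if reading.get("temperature", 0.0) > 80.0:                 # filter algorithm
        producer.send("overheating-alerts", reading)           # hypothetical output topic
```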
IT Monitoring in the Era of Containers | Luca Deri Founder & Project Lead | ntop – InfluxData
Network traffic monitoring tools are traditionally based on the packet paradigm where tools need to analyse each incoming and outgoing packet. As systems are moving towards a micro-service oriented architecture based on containers, the packet paradigm is no longer enough to provide IT visibility as services interact inside a system and not over a network where it is possible to install network sensors. This talk will explain how open source tools designed by ntop on top of InfluxDB allow packet monitoring tools to be complemented with container monitoring and thus implement a lightweight visibility solution for modern IT infrastructures.
Case Studies in advanced analytics with R – Wit Jakuczun
A talk I gave at SQLDay 2017:
About 1.5 years ago Microsoft finalised its acquisition of Revolution Analytics – a provider of software and services for R. In my opinion this was one of the most important events for the R community, and it is now crucial to present its capabilities to the SQL Server community; both parties will benefit. I will present three case studies: cash optimisation at Deutsche Bank, a midterm model for energy price forecasting, and workforce demand optimisation. The case studies were implemented with R Suite, our analytical workflow tool, which will also be briefly presented.
Know your R usage workflow to handle reproducibility challenges – Wit Jakuczun
R is used in a vast range of ways, from pure ad hoc analysis by hobbyists to organized, structured use in an enterprise. Each way of using R brings different reproducibility challenges. Going through a range of typical workflows, we will show that understanding reproducibility must start with understanding your workflow. While presenting the workflows, we will show how we deal with reproducibility challenges using R Suite (http://rsuite.io), an open-source solution we developed to support our large-scale R development.
Developing Your Own Flux Packages by David McKay | Head of Developer Relation... – InfluxData
Flux is easy to contribute to, and it is easy to share functions and libraries of Flux code with other developers. Although there are many functions in the language, the true power of Flux is its ability to be extended with custom functions. In this session, David will show you how to write your own custom function to perform some new analytics.
Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ... – Revolution Analytics
[Presentation by Skylar Lyon at DataWeek 2014, September 17 2014.]
I recently faced the task of scaling out an existing analytics process. The schedule was compressed - it always is in my world. The data was big - 400+ million rows waiting in a database. What did I do? I offered my favorite type of solution - quick and dirty.
At the outset, I wasn't sure how easy it would be. Nor was I certain of realized performance gains. But the concept seemed sound and the exercise fun. Let's move the compute to the data via Revolution R Enterprise for Teradata.
This presentation outlines my approach in leveraging a colleague's R models as I experimented with running R in-database. Would my path lead to significant improvement? Could it be used to productionalize the workflow?
Presentation given on the 15th July 2021 at the Airflow Summit 2021
Conference website: https://airflowsummit.org/sessions/2021/clearing-airflow-obstructions/
Recording: https://www.crowdcast.io/e/airflowsummit2021/40
Managing large (and small) R based solutions with R Suite – Wit Jakuczun
The presentation I gave at DataMass Gdańsk Summit in 2017:
R is a great tool for data scientists. Dynamic and popular, it is now one of the most important technologies on the market. Unfortunately, out-of-the-box R is not suited for large-scale applications. I will present R Suite, an open-source solution we developed for ourselves to manage the R development process.
Raster Algebra with Oracle Spatial and uDig – Karin Patenge
The slide deck describes the integration of Oracle Spatial with open source technologies. Using uDig as an example, it shows step by step how the tool can be used together with Oracle Spatial for raster data analysis. As a worked example, a vegetation index (NDVI) is computed.
If you are interested, read more on the Oracle Spatial blog (http://oracle-spatial.blogspot.com).
Managing your own PostgreSQL servers is sometimes a burden your business does not want. In this talk we will provide an overview of some of the public cloud offerings available for hosted PostgreSQL and discuss a number of strategies for migrating your databases with a minimum of downtime.
HNSciCloud has shared with the IT experts in scientific computing who belong to the HEPiX forum the status of the ongoing Pre-Commercial Procurement of innovative cloud services and what the expected results might be.
Deploying MariaDB for HA on Google Cloud Platform – MariaDB plc
Google Cloud Platform (GCP) is a rising star in the world of cloud infrastructure. Of course there is Google CloudSQL, but sometimes you need more control over your databases. In this session, Matthias Crauwels of Pythian guides you through the process of deploying and managing MariaDB in the cloud on GCP.
Hybrid and Multi-Cloud Strategies for Kubernetes with GitOps – Sonja Schweigert
One of the biggest advantages Kubernetes has to offer is that it is agnostic to infrastructure and capable of managing diverse workloads running on different compute resources. This allows organizations to support multiple developer platforms that can operate across many environments, such as on-premises, hybrid, and multiple clouds.
Streamlined processes and automation are pivotal for operations when managing clusters at scale and maintaining security and policy checks. Paul Curtis, Principal Solutions Architect, will demonstrate GitOps and Weave Kubernetes Platform in a hybrid and multi-cloud setup.
Learn how to:
Use model-driven automation to increase reliability and stability across environments
Simplify multi-cluster management with GitOps
Enable developers to push code to production daily (self-service)
Improve utilization and capacity management through Kubernetes platforms on cloud and on-premise infrastructure
In the big data world, it's not always easy for Python users to move huge amounts of data around. Apache Arrow defines a common format for data interchange, while Arrow Flight, introduced in version 0.11.0, provides a means to move that data efficiently between systems. Arrow Flight is a framework for Arrow-based messaging built with gRPC. It enables data microservices where clients can produce and consume streams of Arrow data to share it over the wire. In this session, I'll give a brief overview of Arrow Flight from a Python perspective, and show that it's easy to build high performance connections when systems can talk Arrow. I'll also cover some ongoing work in using Arrow Flight to connect PySpark with TensorFlow - two systems with great Python APIs but very different underlying internal data representations.
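To make the idea concrete, here is a minimal sketch of Arrow Flight in Python with pyarrow.flight: a server that streams an Arrow table for any ticket, and a client that fetches it. The port and the toy dataset are placeholder choices, not details from the session.

```python
# Minimal Arrow Flight sketch: serve an Arrow table over gRPC and read it
# back. The table, ticket contents, and port are placeholders.
import threading
import time

import pyarrow as pa
import pyarrow.flight as flight

class DemoServer(flight.FlightServerBase):
    def __init__(self, location="grpc://0.0.0.0:8815"):
        super().__init__(location)
        self._table = pa.table({"x": [1, 2, 3], "y": [0.1, 0.2, 0.3]})

    def do_get(self, context, ticket):
        # Stream the table's record batches back to the caller.
        return flight.RecordBatchStream(self._table)

server = DemoServer()
threading.Thread(target=server.serve, daemon=True).start()  # serve() blocks
time.sleep(0.5)                                             # crude wait for startup

client = flight.connect("grpc://localhost:8815")
reader = client.do_get(flight.Ticket(b"demo"))              # ticket payload is arbitrary here
print(reader.read_all())                                    # an Arrow table
```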
KNIME Data Science Learnathon: From Raw Data To Deployment - Paris - November... – KNIMESlides
Here are the slides from our Data Science Learnathons. A learnathon is where we learn more about the data science cycle - data access, data blending, data preparation, model training, optimization, testing, and deployment. We also work in groups to hack a workflow-based solution to guided exercises. The tool of choice for this learnathon is KNIME Analytics Platform.
KNIME hosted a data science learnathon in London on 18.1.2018. You can find the slides that were presented here.
For more upcoming events and courses visit: https://www.knime.com/learning/events
In March and April 2018 KNIME hosted a series of Learnathons in the US. You can find the slides that were presented here.
For more upcoming events and courses visit: https://www.knime.com/learning/events
KNIME Data Science Learnathon: From Raw Data To Deployment – KNIMESlides
Here are the slides from our Data Science Learnathons. A learnathon is where we learn more about the data science cycle - data access, data blending, data preparation, model training, optimization, testing, and deployment. We also work in groups to hack a workflow-based solution to guided exercises. The tool of choice for this learnathon is KNIME Analytics Platform.
Webinar: Deep Learning Pipelines Beyond the Learning – Mesosphere Inc.
Mesosphere technical lead Joerg Schad looks at the complete deep learning pipeline. In these slides, Joerg addresses commonly asked questions, such as:
1. How can we easily deploy distributed deep learning frameworks on any public or private infrastructure?
2. How can we manage different deep learning frameworks on a single cluster, especially considering heterogeneous resources such as GPUs?
3. What is the best UI for a data scientist to work with the cluster?
4. How can we store & serve models at scale?
5. How can we update models that are currently in use without causing downtime for the service using them?
6. How can we monitor the entire pipeline and track performance of the deployed models?
Model Parallelism in Spark ML Cross-Validation with Nick Pentreath and Bryan ... – Databricks
Tuning a Spark ML model with cross-validation can be an extremely computationally expensive process. As the number of hyperparameter combinations increases, so does the number of models being evaluated. The default configuration in Spark is to evaluate each of these models one-by-one to select the best performing. When running this process with a large number of models, if the training and evaluation of a model does not fully utilize the available cluster resources then that waste will be compounded for each model and lead to long run times.
Enabling model parallelism in Spark cross-validation, available from Spark 2.3, allows more than one model to be trained and evaluated at the same time, making better use of cluster resources. We will go over how to enable this setting in Spark, what effect it has on an example ML pipeline, and best practices to keep in mind when using this feature.
Additionally, we will discuss ongoing work in progress to reduce the amount of computation required when tuning ML pipelines by eliminating redundant transformations and intelligently caching intermediate datasets. This can be combined with model parallelism to further reduce the run time of cross-validation for complex machine learning pipelines.
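As a sketch of the setting discussed above (the parallelism parameter added to Spark 2.3's tuning API; the estimator, grid, and data are placeholder choices, not from the talk):

```python
# Cross-validation with model parallelism: up to `parallelism` candidate
# models are trained and evaluated concurrently instead of one-by-one.
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

lr = LogisticRegression(featuresCol="features", labelCol="label")
grid = (ParamGridBuilder()
        .addGrid(lr.regParam, [0.01, 0.1, 1.0])
        .addGrid(lr.elasticNetParam, [0.0, 0.5, 1.0])
        .build())                     # 3 x 3 = 9 hyperparameter combinations

cv = CrossValidator(
    estimator=lr,
    estimatorParamMaps=grid,
    evaluator=BinaryClassificationEvaluator(),
    numFolds=3,
    parallelism=4,                    # fit/evaluate up to 4 models at once
)
# best_model = cv.fit(training_df).bestModel  # training_df: (features, label) DataFrame
```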
Machine Learning at the Edge (AIM302) - AWS re:Invent 2018 – Amazon Web Services
Video-based tools have enabled advancements in computer vision, such as in-vehicle use cases for AI. However, it is not always possible to send this data to the cloud to be processed. In this session, learn how to train machine learning models using Amazon SageMaker and deploy them to an edge device using AWS Greengrass, enabling you to process data quickly at the edge, even when there is no connectivity.
If you understand the rule engine, especially how the RETE algorithm works, you can use it for machine learning. This slide deck was used at a Red Hat Forum Tokyo 2018 session.
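For readers unfamiliar with rule engines, here is a toy sketch of the underlying idea. This is a naive forward-chaining matcher, not the RETE algorithm itself (RETE's contribution is caching partial matches in a network so rules are not re-evaluated against all facts on every cycle), and the rules and facts are made up for illustration.

```python
# Naive forward-chaining rule engine: fire rules until no new facts appear.
# Illustrative only; a real RETE engine caches partial matches in a network.
rules = [
    # (name, condition over the fact set, fact to assert when it holds)
    ("high-temp", lambda f: ("temp", "high") in f, ("alert", "overheat")),
    ("escalate",  lambda f: ("alert", "overheat") in f, ("action", "shutdown")),
]

def forward_chain(facts):
    facts = set(facts)
    changed = True
    while changed:                       # loop until a fixed point is reached
        changed = False
        for name, cond, conclusion in rules:
            if cond(facts) and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(forward_chain({("temp", "high")}))
# {('temp', 'high'), ('alert', 'overheat'), ('action', 'shutdown')}
```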
Amazon CI/CD Practices for Software Development Teams - SRV320 - Anaheim AWS ... – Amazon Web Services
At Amazon, continuous integration and continuous delivery (CI/CD) techniques enable collaboration, increase agility, and deliver a high-quality product faster. In this talk, we walk you through the practices we use for both the CI and the CD of software delivery. For CI, we showcase how we incorporate pull requests to increase team collaboration. We also demonstrate how to optimize CI workflows for speed with caching, code analysis, and integration testing. For CD, we share example safety mechanisms, including canary testing, rollbacks, and Availability Zone redundancy. We use the AWS developer tools that were designed based on the internal Amazon tooling: AWS CodeCommit, AWS CodeBuild, AWS CodePipeline, AWS CodeDeploy, and AWS X-Ray.
OSDC 2018 | Apache Ignite - the in-memory hammer for your data science toolki... – NETWAYS
Machine learning is a method of data analysis that automates the building of analytical models. By using algorithms that iteratively learn from data, computers are able to find hidden insights without the help of explicit programming. These insights bring tremendous benefits into many different domains. For business users, in particular, these insights help organizations improve customer experience, become more competitive, and respond much faster to opportunities or threats. The availability of very powerful in-memory computing platforms, such as Apache Ignite, means that more organizations can benefit from machine learning today. In this presentation we will look at some of the main components of Apache Ignite, such as the Compute Grid, Data Grid and the Machine Learning Grid. Through examples, attendees will learn how Apache Ignite can be used for data analysis.
Amazon CI/CD Practices for Software Development Teams - SRV320 - Atlanta AWS ... – Amazon Web Services
At Amazon, continuous integration and continuous delivery (CI/CD) techniques enable collaboration, increase agility, and deliver a high-quality product faster. In this talk, we walk through best practices for both the CI and the CD of software delivery. For CI, we showcase how to incorporate pull requests to increase team collaboration, and demonstrate how to optimize CI workflows for speed with caching, code analysis, and integration testing. For CD, we share example safety mechanisms, including canary testing, rollbacks, and Availability Zone redundancy. We use AWS developer tools that were designed based on the internal Amazon tooling: AWS CodeCommit, AWS CodeBuild, AWS CodePipeline, and AWS CodeDeploy.
Sharing and Deploying Data Science with KNIME Server – KNIMESlides
You're currently using the open source KNIME Analytics Platform, but looking for more functionality - especially for working across teams and business units?
KNIME Server is the enterprise software for team-based collaboration, automation, management, and deployment of data science workflows, data, and guided analytics. Non-experts are given access to data science via the KNIME Server WebPortal, or can use REST APIs to integrate workflows as analytic services into applications, IoT, and systems.
These slides are from a webinar where we introduce all KNIME Server features. It covers everything you need to manage your analytics at scale - deploying your workflows for sharing and collaboration, scheduling and automating tasks, templating and version control, as well as enterprise integration. It highlights the power of the KNIME Server REST API and the KNIME WebPortal - the ideal way to bring data analytics to your non-experts.
View the webinar at: https://www.youtube.com/watch?v=qFV4P9-enZk
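As an illustration of the REST idea mentioned above (an application calling a deployed workflow as an analytic service), the snippet below posts input data to a workflow endpoint. The URL, endpoint path, credentials, and payload are all hypothetical placeholders, not the documented KNIME Server API; consult the KNIME Server REST documentation for the real paths.

```python
# Hypothetical call to a deployed workflow over REST. Everything below
# (host, path, auth, payload shape) is a placeholder, not the actual
# KNIME Server API.
import requests

resp = requests.post(
    "https://knime-server.example.com/rest/workflow-execution",   # hypothetical
    auth=("analyst", "secret"),                                   # placeholder credentials
    json={"input-table": [{"customer_id": 42, "spend": 310.0}]},  # hypothetical payload
    timeout=60,
)
resp.raise_for_status()
print(resp.json())  # workflow output consumed by the calling application
```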
Shift-Left SRE: Self-Healing with AWS Lambda Functions (DEV313-S) - AWS re:In... – Amazon Web Services
Even the best continuous delivery and DevOps practices cannot guarantee that there will be no issues in production. The rise of Site Reliability Engineering (SRE) has promoted new ways to automate resilience into your system and applications to circumvent potential problems, but it’s time to “shift-left” this effort into engineering. In this session, learn to leverage AWS Lambda functions as “remediation as code.” We show how to make it part of your continuous delivery process and orchestrate the invocation of Self-Healing Lambda functions in case of unexpected situations impacting the reliability of your system. Gone are the days of traditional operation teams—it’s the rise of “shift-lefters”! This session is brought to you by AWS partner, Dynatrace.
Amazon CI/CD Practices for Software Development Teams - SRV320 - Chicago AWS ... – Amazon Web Services
At Amazon, continuous integration and continuous delivery (CI/CD) techniques enable collaboration, increase agility, and deliver a high-quality product faster. In this talk, we walk you through the practices we use for both the CI and CD of software delivery. For CI, we showcase how we incorporate pull requests to increase team collaboration. We also demonstrate how to optimize CI workflows for speed with caching, code analysis, and integration testing. For CD, we share example safety mechanisms, including canary testing, rollbacks, and Availability Zone redundancy. We use the AWS developer tools whose designs were based on the internal Amazon tooling: AWS CodeCommit, AWS CodeBuild, AWS CodePipeline, AWS CodeDeploy, and AWS X-Ray.
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini... – Scintica Instrumentation
Intravital microscopy (IVM) is a powerful tool used to study cellular behavior over time and space in vivo. Much of our understanding of cell biology has been gained using various in vitro and ex vivo methods; however, these studies do not necessarily reflect the natural dynamics of biological processes. Unlike traditional cell culture or fixed-tissue imaging, IVM allows for ultra-fast, high-resolution imaging of cellular processes over time and space in their natural environment. Real-time visualization of biological processes in the context of an intact organism helps maintain physiological relevance and provides insights into the progression of disease, responses to treatment, and developmental processes.
In this webinar we give an overview of advanced applications of the IVM system in preclinical research. IVIM Technology is a provider of all-in-one intravital microscopy systems and solutions optimized for in vivo imaging of live animal models at sub-micron resolution. The system's unique features and user-friendly software enable researchers to probe fast, dynamic biological processes such as immune cell tracking and cell-cell interaction, as well as vascularization and tumor metastasis, in exceptional detail. This webinar also gives an overview of IVM as used in drug development, offering a view into the intricate interaction between drugs/nanoparticles and tissues in vivo and allowing for the evaluation of therapeutic intervention in a variety of tissues and organs. This interdisciplinary collaboration continues to drive the advancement of novel therapeutic strategies.
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ... – Sérgio Sacani
We characterize the earliest galaxy population in the JADES Origins Field (JOF), the deepest imaging field observed with JWST. We make use of the ancillary Hubble optical images (5 filters spanning 0.4–0.9 µm) and novel JWST images with 14 filters spanning 0.8–5 µm, including 7 medium-band filters, and reaching total exposure times of up to 46 hours per filter. We combine all our data at >2.3 µm to construct an ultradeep image, reaching as deep as ≈31.4 AB mag in the stack and 30.3–31.0 AB mag (5σ, r = 0.1″ circular aperture) in individual filters. We measure photometric redshifts and use robust selection criteria to identify a sample of eight galaxy candidates at redshifts z = 11.5–15. These objects show compact half-light radii of R1/2 ∼ 50–200 pc, stellar masses of M⋆ ∼ 10^7–10^8 M⊙, and star-formation rates of SFR ∼ 0.1–1 M⊙ yr⁻¹. Our search finds no candidates at 15 < z < 20, placing upper limits at these redshifts. We develop a forward-modeling approach to infer the properties of the evolving luminosity function, without binning in redshift or luminosity, that marginalizes over the photometric redshift uncertainty of our candidate galaxies and incorporates the impact of non-detections. We find a z = 12 luminosity function in good agreement with prior results, and that the luminosity function normalization and UV luminosity density decline by a factor of ∼2.5 from z = 12 to z = 14. We discuss the possible implications of our results in the context of theoretical models for the evolution of the dark matter halo mass function.
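For orientation, the two quantities in the last result are directly related; in standard notation (not this paper's specific parameterization), the UV luminosity density is the luminosity-weighted integral of the luminosity function:

```latex
% Standard relation, not the paper's exact parameterization: \phi(L, z) is
% the comoving number density of galaxies per unit UV luminosity.
\[
  \rho_{\mathrm{UV}}(z) = \int_{L_{\min}}^{\infty} L\,\phi(L, z)\,\mathrm{d}L
\]
```

so at a fixed luminosity-function shape, a factor of ∼2.5 drop in the normalization implies a comparable drop in the UV luminosity density.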
Slide 1: Title Slide
Extrachromosomal Inheritance
Slide 2: Introduction to Extrachromosomal Inheritance
Definition: Extrachromosomal inheritance refers to the transmission of genetic material that is not found within the nucleus.
Key Components: Involves genes located in mitochondria, chloroplasts, and plasmids.
Slide 3: Mitochondrial Inheritance
Mitochondria: Organelles responsible for energy production.
Mitochondrial DNA (mtDNA): Circular DNA molecule found in mitochondria.
Inheritance Pattern: Maternally inherited, meaning it is passed from mothers to all their offspring.
Diseases: Examples include Leber’s hereditary optic neuropathy (LHON) and mitochondrial myopathy.
Slide 4: Chloroplast Inheritance
Chloroplasts: Organelles responsible for photosynthesis in plants.
Chloroplast DNA (cpDNA): Circular DNA molecule found in chloroplasts.
Inheritance Pattern: Often maternally inherited in most plants, but can vary in some species.
Examples: Variegation in plants, where leaf color patterns are determined by chloroplast DNA.
Slide 5: Plasmid Inheritance
Plasmids: Small, circular DNA molecules found in bacteria and some eukaryotes.
Features: Can carry antibiotic resistance genes and can be transferred between cells through processes like conjugation.
Significance: Important in biotechnology for gene cloning and genetic engineering.
Slide 6: Mechanisms of Extrachromosomal Inheritance
Non-Mendelian Patterns: Do not follow Mendel’s laws of inheritance.
Cytoplasmic Segregation: During cell division, organelles like mitochondria and chloroplasts are randomly distributed to daughter cells.
Heteroplasmy: Presence of more than one type of organellar genome within a cell, leading to variation in expression.
Slide 7: Examples of Extrachromosomal Inheritance
Four O’clock Plant (Mirabilis jalapa): Shows variegated leaves due to different cpDNA in leaf cells.
Petite Mutants in Yeast: Result from mutations in mitochondrial DNA affecting respiration.
Slide 8: Importance of Extrachromosomal Inheritance
Evolution: Provides insight into the evolution of eukaryotic cells.
Medicine: Understanding mitochondrial inheritance helps in diagnosing and treating mitochondrial diseases.
Agriculture: Chloroplast inheritance can be used in plant breeding and genetic modification.
Slide 9: Recent Research and Advances
Gene Editing: Techniques like CRISPR-Cas9 are being used to edit mitochondrial and chloroplast DNA.
Therapies: Development of mitochondrial replacement therapy (MRT) for preventing mitochondrial diseases.
Slide 10: Conclusion
Summary: Extrachromosomal inheritance involves the transmission of genetic material outside the nucleus and plays a crucial role in genetics, medicine, and biotechnology.
Future Directions: Continued research and technological advancements hold promise for new treatments and applications.
Slide 11: Questions and Discussion
Invite Audience: Open the floor for any questions or further discussion on the topic.
Seminar of U.V. Spectroscopy by SAMIR PANDA – SAMIR PANDA
Spectroscopy is a branch of science dealing with the study of the interaction of electromagnetic radiation with matter.
Ultraviolet-visible spectroscopy refers to absorption spectroscopy or reflectance spectroscopy in the UV-VIS spectral region.
Ultraviolet-visible spectroscopy is an analytical method that can measure the amount of light absorbed by the analyte.
Cancer cell metabolism: Special Reference to Lactate Pathway – AADYARAJPANDEY1
Normal Cell Metabolism:
Cellular respiration describes the series of steps that cells use to break down sugar and other chemicals to get the energy we need to function.
Energy is stored in the bonds of glucose and when glucose is broken down, much of that energy is released.
Cells utilize energy in the form of ATP.
The first step of respiration is called glycolysis. In a series of steps, glycolysis breaks glucose into two molecules of a smaller chemical called pyruvate. A small amount of ATP is formed during this process.
Most healthy cells continue the breakdown in a second process, called the Krebs cycle. The Krebs cycle allows cells to "burn" the pyruvates made in glycolysis to get more ATP.
The last step in the breakdown of glucose is called oxidative phosphorylation (Ox-Phos).
It takes place in specialized cell structures called mitochondria. This process produces a large amount of ATP. Importantly, cells need oxygen to complete oxidative phosphorylation.
If a cell completes only glycolysis, only 2 molecules of ATP are made per glucose. However, if the cell completes the entire respiration process (glycolysis - Krebs cycle - oxidative phosphorylation), about 36 molecules of ATP are created, giving it much more energy to use.
IN CANCER CELLS:
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the second step, oxidative phosphorylation.
This results in only 2 molecules of ATP per each glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
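Using the figures quoted above, the energy gap works out as follows:

```latex
% Worked arithmetic from the approximate yields quoted in the text.
\[
  \frac{36~\text{ATP per glucose (full respiration)}}
       {2~\text{ATP per glucose (glycolysis only)}} = 18
\]
```

so a cell relying on glycolysis alone must consume roughly 18 times more glucose to match the ATP output of full respiration.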
Introduction to the Warburg phenomenon:
WARBURG EFFECT: Usually, cancer cells are highly glycolytic (glucose addiction) and take up more glucose from outside than normal cells do.
Otto Heinrich Warburg (8 October 1883 – 1 August 1970) was awarded the 1931 Nobel Prize in Physiology or Medicine for his "discovery of the nature and mode of action of the respiratory enzyme."
WARBURG EFFECT: The tendency of cancer cells under aerobic (well-oxygenated) conditions to metabolize glucose to lactate (aerobic glycolysis) is known as the Warburg effect. Warburg made the observation that tumor slices consume glucose and secrete lactate at a higher rate than normal tissues.
Observation of Io's Resurfacing via Plume Deposition Using Ground-based Adapt... – Sérgio Sacani
Since volcanic activity was first discovered on Io from Voyager images in 1979, changes on Io's surface have been monitored from both spacecraft and ground-based telescopes. Here, we present the highest spatial resolution images of Io ever obtained from a ground-based telescope. These images, acquired by the SHARK-VIS instrument on the Large Binocular Telescope, show evidence of a major resurfacing event on Io's trailing hemisphere. When compared to the most recent spacecraft images, the SHARK-VIS images show that a plume deposit from a powerful eruption at Pillan Patera has covered part of the long-lived Pele plume deposit. Although this type of resurfacing event may be common on Io, few have been detected due to the rarity of spacecraft visits and the previously low spatial resolution available from Earth-based telescopes. The SHARK-VIS instrument ushers in a new era of high-resolution imaging of Io's surface using adaptive optics at visible wavelengths.
This PDF is about schizophrenia.
For more details, visit the YouTube channel @SELF-EXPLANATORY:
https://www.youtube.com/channel/UCAiarMZDNhe1A3Rnpr_WkzA/videos
Thanks...!