This document summarizes work on digital normalization, a technique for reducing sequencing data size prior to assembly. Digital normalization works by discarding reads whose median k-mer count has already reached a cutoff, keeping only reads that still add new coverage, based on k-mer frequencies in the de Bruijn graph. It can remove over 95% of data in a single pass with fixed memory. Digital normalization enables assembly of large datasets in the cloud by reducing data size and memory requirements. The document acknowledges collaborators and funding sources and provides links for code, blogs, papers, and future events.
2. Acknowledgements
Lab members involved: Adina Howe (w/Tiedje), Jason Pell, Arend Hintze, Rosangela Canino-Koning, Qingpeng Zhang, Elijah Lowe, Likit Preeyanon, Jiarong Guo, Tim Brom, Kanchan Pavangadkar, Eric McDonald
Collaborators: Jim Tiedje, MSU; Billie Swalla, UW; Janet Jansson, LBNL; Susannah Tringe, JGI
Funding: USDA NIFA; NSF IOS; BEACON.
4. “Be the change you want to see”
We are aggressively open…
Everything discussed here:
Code: github.com/ged-lab/ ; BSD license
Blog: http://ivory.idyll.org/blog ('titus brown blog')
Twitter: @ctitusbrown
Grants on Lab Web site:
http://ged.msu.edu/interests.html
(What's a good license??)
Preprints: on arXiv, q-bio:
'kmer-percolation arxiv'
'diginorm arxiv'
5. The data catastrophe!
Data set sizes growing faster than compute capacity
(esp RAM).
Many biological algorithms don't scale all that well,
anyway.
Algorithmically, we want:
Single-pass.
Compression approaches (lossy or otherwise).
Low-memory data structures
I, personally, think the last thing in the world we need
is another standalone package: pre-filtering
approaches.
“Run our nifty approaches first, then feed into the
6. Digital normalization
Suppose you have a
dilution factor of A (10) to
B (1). To get 10x coverage of B you
need to get 100x coverage of A!
Overkill!!
This 100x will consume
disk space and, because
of errors, memory.
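The arithmetic on this slide can be written out as a small sketch. The function name `required_coverage` and the dictionary layout are illustrative assumptions, not part of the talk; the numbers are the ones from the slide (A at 10, B at 1, 10x wanted for B).

```python
def required_coverage(abundances, target, min_coverage):
    """Coverage every species ends up with when sequencing deep enough
    for `target` to reach `min_coverage` (hypothetical helper)."""
    depth_per_unit = min_coverage / abundances[target]
    return {name: abundance * depth_per_unit
            for name, abundance in abundances.items()}


# Relative abundances from the slide: A is 10 times more abundant than B.
coverage = required_coverage({"A": 10, "B": 1}, target="B", min_coverage=10)
print(coverage)  # A ends up at 100x while B only just reaches 10x
```

The 90x of A beyond what B needed is the redundant data that normalization aims to drop.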
8. Digital normalization algorithm
for read in dataset:
    if median_kmer_count(read) < CUTOFF:
        update_kmer_counts(read)
        save(read)
    else:
        pass  # discard read
Note, single pass; fixed memory.
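A minimal runnable sketch of the loop above, under stated assumptions: the names `kmers` and `digital_normalization`, the tiny `k=4`, and the cutoff of 3 are all chosen for illustration, and reverse complements are ignored. It uses an exact `collections.Counter`, which is not fixed memory; a fixed-size approximate counter would be needed to match the "fixed memory" note.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """All overlapping substrings of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def digital_normalization(reads, k=4, cutoff=3):
    """Single pass: keep a read only while its median k-mer count is
    still below the cutoff (illustrative parameters, exact counting)."""
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k, nothing to count
        if median(counts[km] for km in read_kmers) < cutoff:
            counts.update(read_kmers)  # only kept reads are counted
            kept.append(read)
        # else: discard read
    return kept


# Ten copies of an abundant read plus one rare read.
reads = ["ACGTTGCAAG"] * 10 + ["TTTTGGGGCC"]
kept = digital_normalization(reads, k=4, cutoff=3)
print(len(kept), "of", len(reads), "reads kept")
```

In this toy input the abundant read stops being kept once its k-mers have been seen `cutoff` times, while the rare read passes through untouched.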
9. Digital normalization is efficient & effective
Algorithmic nerdvana!
• Single pass algorithm;
• Fixed memory;
• Cheaper than assembly;
• Reduces assembly time;
• Scales assembly memory.
Brown et al., in review, PLoS On
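One way to get the "fixed memory" property is to count k-mers in a probabilistic structure with a fixed number of slots. The sketch below is a Count-Min Sketch; it is an assumption made for illustration, since the slides do not say which data structure is used, and the class name, `width`, and `depth` are hypothetical. Counts can be overestimated when k-mers collide, but never underestimated.

```python
import hashlib


class CountMinSketch:
    """Approximate counter with a fixed number of slots (illustrative)."""

    def __init__(self, width=10007, depth=4):
        self.width = width
        self.depth = depth
        # depth rows of width counters; the slot count never grows.
        self.tables = [[0] * width for _ in range(depth)]

    def _slots(self, item):
        # One salted hash per row, so each row spreads items differently.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=bytes([row])
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, item):
        for row, col in self._slots(item):
            self.tables[row][col] += 1

    def count(self, item):
        # The smallest counter across rows is the least inflated estimate.
        return min(self.tables[row][col] for row, col in self._slots(item))


sketch = CountMinSketch(width=101, depth=3)
for _ in range(3):
    sketch.add("ACGT")
print(sketch.count("ACGT"))
```

The trade-off is accuracy for memory: a larger `width` lowers the collision rate, and the table size is chosen up front rather than growing with the data set.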
13. Other key points
Virtually identical contig assembly; scaffolding works
but is not yet cookie-cutter.
Digital normalization changes the way de Bruijn graph
assembly scales from the size of your data set to
the size of the source sample.
Always lower memory than assembly: we never
collect most erroneous k-mers.
Digital normalization can be done once, and then
assembly parameter exploration can be done.
14. Quotable quotes.
Comment: “This looks like a great solution for
people who can’t afford real computers”.
OK, but:
“Buying ever bigger computers is a great
solution for people who don’t want to think
hard.”
To be less snide: both kinds of scaling are needed,
of course.
15. Why use diginorm?
Use the cloud to assemble any microbial
genomes incl. single-cell, many eukaryotic
genomes, most mRNAseq, and many
metagenomes.
Seems to provide leverage on addressing many
biological or sample prep problems (single-cell &
genome amplification MDA; metagenome;
heterozygosity).
And, well, the general idea of locus specific
graph analysis solves lots of things…
16. Some interim concluding thoughts
Digital normalization-like approaches provide a
path to solving the majority of assembly scaling
problems, and will enable assembly on current
cloud computing hardware.
This is not true for highly diverse metagenome
environments…
For soil, we estimate that we need 50 Tbp / gram
soil. Sigh.
Biologists and bioinformaticians hate:
Throwing away data
Caveats in bioinformatics papers (which reviewers
like, note)
17. Streaming error correction.
We can do error trimming of genomic, MDA, transcriptomic,
metagenomic data in < 2 passes, fixed memory.
We have just submitted a proposal to adapt Euler or
Quake-like error correction (e.g. spectral alignment
problem) to this framework.
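As a rough illustration of k-mer based error trimming, the sketch below truncates a read just before its first low-abundance k-mer. This is a simplified two-pass version with exact counts, not the streaming, fixed-memory method the slide refers to; the function names, `k=4`, and `min_count=2` are assumptions for the example.

```python
from collections import Counter


def count_kmers(reads, k):
    """First pass: exact k-mer counts over all reads."""
    counts = Counter()
    for read in reads:
        counts.update(read[i:i + k] for i in range(len(read) - k + 1))
    return counts


def trim_at_low_abundance(read, counts, k, min_count):
    """Second pass: cut the read before the first k-mer seen fewer than
    min_count times, on the assumption that rare k-mers mark errors."""
    for i in range(len(read) - k + 1):
        if counts[read[i:i + k]] < min_count:
            # Keep bases through the end of the last trusted k-mer.
            return read[:i + k - 1] if i > 0 else ""
    return read


# Five clean copies and one copy with a single substitution near the end.
reads = ["ACGTTGCAAG"] * 5 + ["ACGTTGCATG"]
counts = count_kmers(reads, k=4)
trimmed = [trim_at_low_abundance(r, counts, k=4, min_count=2) for r in reads]
print(trimmed[-1])
```

Trimming discards the suspect tail rather than correcting it, which is the simpler of the two operations the slide mentions.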
18. Side note: error correction is the biggest “data” problem left in sequencing.
Both for mapping & assembly.
19. Replication fu
In December 2011, I met Wes McKinney on a
train and he convinced me that I should look at
IPython Notebook.
This is an interactive Web notebook for data
analysis…
Hey, neat! We can use this for replication!
All of our figures can be regenerated from scratch,
on an EC2 instance, using a Makefile (data
pipeline) and IPython Notebook (figure generation).
Everything is version controlled.
Honestly not much work, and will be less the next
time.
21. So… how'd that go?
People who already cared thought it was nifty.
http://ivory.idyll.org/blog/replication-i.html
Almost nobody else cares ;(
Presub enquiry to editor: “Be sure that your paper can
be reproduced.” Uh, please read my letter to the end?
“Could you improve your Makefile? I want to
reimplement diginorm in another language and reuse
your pipeline, but your Makefile is a mess.”
Incredibly useful, nonetheless. Already part of
undergraduate and graduate training in my lab;
helping us and others with next papers; etc. etc. etc.
Life is way too short to waste on unnecessarily
replicating your own workflows, much less other
people’s.
23. Advertisement!
Qingpeng Zhang (QP) will talk about our very
useful 'khmer' software for efficiently counting
k-mers.
Want a simple Python lib for reading & indexing
FASTA/FASTQ? Check out screed.
“Better science through superior software.”