The document discusses genomic analysis of the fire ant Solenopsis invicta. It notes that the genome sequencing of a Gp-9 B male fire ant revealed an expansion of lipid-processing gene families and over 420 putative olfactory receptors, more than any other insect. It also identified a functional DNA methylation system. Previous research had linked the fire ant's social structure to its Gp-9 locus, but genome sequencing provided more genomic context around this gene and others related to social behavior and chemical signaling.
The document discusses genomic research on fire ants. It summarizes that the genome of a fire ant was sequenced, which revealed an expansion of lipid-processing and olfactory receptor genes. Over 400 putative olfactory receptors were identified, more than any other insect sequenced so far. The genome also contains a functional DNA methylation system. Previous research on fire ants linked their social structure to a single gene (Gp-9), but sequencing of the entire genome allowed further investigation into other genes that may be linked.
GS Rubber Industries is a leading manufacturer of molded rubber products founded in 1993. It focuses on quality, efficiency, and attention to detail fostered from its origins in the automotive and defense industries. The company offers rubber molding expertise and state-of-the-art technology for parts like caps, seals, mounts, and boots across many industries.
This document provides an introduction to the routing policy database (RPDB) in Linux and the iproute2 commands. It explains key concepts such as IP addresses, CIDR, tables and scopes, and illustrates the use of the ip link, ip address and ip route commands to configure and display the configuration of network links, addresses and routes. The aim is to explain the theory and practical examples of network management with iproute2 tools on Linux.
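As a rough illustration of the CIDR concept mentioned in this summary, the sketch below uses Python's standard `ipaddress` module rather than the iproute2 commands themselves. The network `192.168.1.0/24` and the helper names `describe` and `contains` are assumptions chosen for the example, not taken from the summarized document.

```python
import ipaddress

# Hypothetical example network; the summary names no specific addresses.
network = ipaddress.ip_network("192.168.1.0/24")


def describe(net):
    """Return the prefix length, netmask and address count of a CIDR block."""
    return {
        "prefix_length": net.prefixlen,
        "netmask": str(net.netmask),
        "num_addresses": net.num_addresses,
    }


def contains(net, address):
    """Report whether a single IP address falls inside the CIDR block."""
    return ipaddress.ip_address(address) in net


if __name__ == "__main__":
    print(describe(network))
    print(contains(network, "192.168.1.42"))
```

A /24 prefix fixes the first 24 bits of the address, which leaves 8 bits (256 addresses) for hosts within the block.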
Improvement of Military Planes From War to War by Todd Roberts
Oswald Boelcke's 8 rules for aerial combat emphasized attacking from a position of advantage, committing fully to attacks, firing at close range, maintaining awareness of one's opponent, targeting vulnerable areas, countering opponents' attacks, maintaining an escape route, and coordinating attacks in groups. Over the decades, fighter aircraft increased dramatically in range, speed, altitude capability, and armaments to gain advantages in aerial combat.
This document discusses genome assembly and sequencing. It provides advice on sampling, sequencing approaches, scaffolding, input data quality assurance, trimming, filtering, choosing an assembler, and testing parameters. It notes that there is no single best way and that testing many combinations of software, parameters, and formats is required. Understanding everything is not necessary, and 20% of the effort can yield 80% of the results. It also briefly describes the fire ant Solenopsis invicta and its two social forms.
The document discusses organizing computational biology projects. It recommends using a logical directory structure with a common root directory for related projects. Within each project directory, it suggests top-level directories for data, results, source code, documents, and binaries. For results directories, it advises creating subdirectories for each experiment with names indicating the date and topic. Maintaining a lab notebook to document experiments and their progress and conclusions is also recommended.
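The layout described in this summary could be set up along the following lines. This is a minimal sketch: the directory names (`data`, `results`, `src`, `doc`, `bin`), the date-then-topic naming scheme and the function names are assumptions for illustration, since the summary lists only the categories and not exact names.

```python
from datetime import date
from pathlib import Path

# Assumed names for the five top-level directories mentioned in the summary.
TOP_LEVEL = ("data", "results", "src", "doc", "bin")


def create_project(root, name):
    """Create a project directory under a common root with the top-level layout."""
    project = Path(root) / name
    for sub in TOP_LEVEL:
        (project / sub).mkdir(parents=True, exist_ok=True)
    return project


def new_experiment(project, topic, day=None):
    """Create a results subdirectory whose name carries the date and the topic."""
    day = day or date.today()
    experiment = Path(project) / "results" / f"{day.isoformat()}-{topic}"
    experiment.mkdir(parents=True, exist_ok=True)
    return experiment
```

Putting the ISO date first means that a plain alphabetical listing of `results` is also chronological.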
The document summarizes research on the fire ant Solenopsis invicta. It describes how sequencing the genome of a fire ant revealed expansions in gene families related to odor detection and lipid processing. The genome sequencing found over 400 putative olfactory receptor genes, more than any other insect sequenced. It also discovered the fire ant has a functional DNA methylation system. Previous research had linked the fire ant's single or multiple queen social forms to a single gene called Gp-9, but the new genome sequencing allowed a more comprehensive analysis of genes near Gp-9.
This document summarizes research on social evolution in ants. It discusses various ant species like leaf-cutter ants, weaver ants, and Forelius pusillus ants. It also discusses fire ants and their single and multiple queen social forms which are associated with their Gp-9 gene. Genome sequencing of the fire ant revealed expansion of odor receptor genes and other findings. The document covers topics like kin selection, evolution of eusociality, and using modern technologies to study insect societies.
Keynote talk given at Fairdom User meeting http://fair-dom.org/communities/users/barcelona-2016-first-user-meeting/ .
I begin by summarising how we apply molecular approaches to understand social behaviour in ants. Subsequently, I give an overview of the data-handling challenges the genomic bioinformatics community faces. Finally, I give an overview of some of the tools and approaches my lab have developed to help us get things done better, faster, more reliably and more reproducibly.
The document discusses the genetic basis of social organization in fire ant populations. Researchers used RAD sequencing of haploid males to discover SNPs and genotype individuals at over 2,400 loci. Principal component analysis separated individuals into two clusters corresponding to their social form (single or multiple queen), with the first principal component explaining over 12% of the variance. A region on chromosome 13 containing the Gp-9 gene was completely associated with social form. This research identified a major gene influencing an important social trait using next-generation sequencing techniques.
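To make the notion of a locus being "completely associated" with social form concrete, here is a toy sketch. It assumes haploid genotypes coded as 0/1 alleles; the function name and any data passed to it are hypothetical, and this is not the analysis pipeline used in the study.

```python
def completely_associated_loci(genotypes, social_forms):
    """Return indices of loci at which the two social forms share no allele.

    genotypes: one list of 0/1 alleles per haploid male (all the same length).
    social_forms: one label per individual, with exactly two distinct labels.
    """
    forms = sorted(set(social_forms))
    if len(forms) != 2:
        raise ValueError("expected exactly two social forms")
    hits = []
    for locus in range(len(genotypes[0])):
        # Collect the alleles seen at this locus within each social form.
        alleles = {form: set() for form in forms}
        for individual, form in zip(genotypes, social_forms):
            alleles[form].add(individual[locus])
        # Complete association: no allele occurs in both forms.
        if alleles[forms[0]].isdisjoint(alleles[forms[1]]):
            hits.append(locus)
    return hits
```

With real data one would also need to handle missing genotypes and correct for testing thousands of loci, which this sketch ignores.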
50% social chromosomes in ants, 50% bioinformatics for genomics in emerging model organisms. Given at #CTBio http://pathogenomics.bham.ac.uk/blog/2013/07/cream-teas-and-bioinformatics-balti-and-bioinformatics-goes-on-its-holidays/
Apologies - videos and transitions are largely missing as part of the PDF conversion.
The work referenced here includes:
http://dx.doi.org/10.1038/nature11832
http://dx.doi.org/10.1073/pnas.1009690108
http://sequenceserver.com
https://github.com/yeban/afra & http://afra.sbcs.qmul.ac.uk
https://github.com/monicadragan/GeneValidator
The document appears to be about the genome of the red fire ant Solenopsis invicta. It summarizes some key findings from sequencing and analyzing the fire ant genome. These include an expansion of gene families related to lipid processing and cuticular hydrocarbons. It also found over 420 putative olfactory receptor genes, more than any other insect genome sequenced. Additionally, the genome appears to have a functional DNA methylation system.
This document discusses the experience of a researcher in genomics with applying FAIR and open approaches. It notes that making data and analysis methods FAIR and open can increase visibility, drive citations, and facilitate collaboration. However, it also enables competitors to more easily access and use resources without contributing. Striking the right balance between openness and protecting competitive advantages is challenging. Overall, the researcher finds FAIR and open principles have greatly increased the impact and robustness of their work, but there are also costs to consider.
2018 08-reduce risks of genomics research by Yannick Wurm
Geoffrey Chang, a protein crystallographer at The Scripps Research Institute, had his career trajectory disrupted when several of his high-profile papers describing protein structures had to be retracted. An in-house software program Chang's lab used to process diffraction data from protein crystals introduced a sign error that inverted the structures, invalidating biological interpretations. This included a 2001 Science paper describing the structure of the MsbA protein. A 2006 Nature paper by Swiss researchers casting doubt on Chang's MsbA structure led him to discover the software error. Chang and his co-authors sincerely regretted the confusion and unproductive research caused by the need to retract their influential papers.
Geoffrey Chang was a prominent structural biologist who received prestigious early career awards. However, his work came under scrutiny when other researchers discovered errors in his published protein structures due to a problem with his in-house data analysis software. This led Chang to retract 5 of his papers describing protein structures. The retractions were costly for Chang's career and reputation as well as for other researchers who had performed follow-up work based on the incorrect structures. The incident highlights the importance of using well-tested, reproducible analysis methods in scientific research.
This document provides an agenda for a spring school on bioinformatics and population genomics, including practical sessions on analyzing genomic data from reads to reference genomes and gene predictions in 6 steps: inspecting and cleaning reads, genome assembly, assessing assembly quality, predicting protein-coding genes, assessing gene prediction quality, and assessing the overall process quality using biological measures. It also addresses wifi issues that could reduce bandwidth and lists the VM password.
This document provides information about a spring school on bioinformatics and population genomics that includes practical sessions. The sessions will cover topics like short read cleaning, genome assembly, gene prediction, quality control, mapping reads to call variants, visualizing variants, analyzing variants through PCA and measuring diversity and differentiation, inferring population sizes and gene flow, and analyzing gene expression from raw sequencing data to expression levels. The document lists the team of practitioners leading the sessions and encourages participants to share their favorite software packages.
2015 12-18- Avoid having to retract your genomics analysis - Popgroup Reprodu... by Yannick Wurm
Brief (15min) talk I gave at #PopGroup49 in Edinburgh providing a few simple methods to reduce risk in genomics analyses.
Please cite: Avoid having to retract your genomics analysis (2015) Y Wurm. The Winnower 2, e143696.68941 https://thewinnower.com/papers/avoid-having-to-retract-your-genomics-analysis
This document contains information about programming in R, including practical examples. It discusses accessing and subsetting data, using regular expressions for text search, creating functions, and using loops. Examples are provided to demonstrate creating vectors, accessing subsets of vectors, using regular expressions to find patterns in text, creating functions to convert between units or estimate values, and using for loops to repeat operations over multiple elements. The document suggests R is useful for working with big data in biology and other fields due to its ability to automate tasks, integrate with other tools, and handle large datasets through programming.
This document describes oSwitch, a tool that allows easy access to other operating systems via one-line commands. It works by wrapping Docker containers, allowing commands to be run in different OS environments without disrupting the user's current environment. The document provides an example usage where a user is able to run an "abyss-pe" command in a Biolinux container after it is not found in their native OS. It notes how oSwitch aims to preserve the user's current working directory, login shell, home directory and file permissions during usage.
This document provides an outline for a lecture on the genetic basis of evolution. It begins with introducing key terms like gene, locus, allele, genotype, and phenotype. It then discusses genetic drift and how drift is influenced by population size. Selection is also introduced and defined as a process where individuals with different genotypes have different fitnesses. The document emphasizes that both genetic drift and selection influence evolution, and neither process should be overemphasized. It aims to move people away from only considering selection (pan-selectionism) and highlights the importance of genetic drift.
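The dependence of drift on population size can be explored with a small simulation. The sketch below assumes a Wright-Fisher-style model with no selection, in which each generation's gene copies are drawn at random according to the previous generation's allele frequency; the function name and parameters are illustrative and not taken from the lecture.

```python
import random


def wright_fisher(pop_size, start_freq, generations, seed=None):
    """Simulate the frequency of one allele under genetic drift alone.

    pop_size is the number of gene copies sampled each generation.
    Returns the list of allele frequencies, starting with start_freq.
    """
    rng = random.Random(seed)
    freq = start_freq
    trajectory = [freq]
    for _ in range(generations):
        # Each gene copy in the next generation carries the allele
        # with probability equal to its current frequency.
        copies = sum(1 for _ in range(pop_size) if rng.random() < freq)
        freq = copies / pop_size
        trajectory.append(freq)
    return trajectory


if __name__ == "__main__":
    for size in (20, 2000):
        print(size, wright_fisher(size, 0.5, 50, seed=1)[-1])
```

Running it with a small and a large `pop_size` tends to show wider swings in the small population, and an allele that reaches frequency 0 or 1 stays there because no mutation is modeled.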
This document discusses human evolution and recent insights from genomics. It summarizes that Neanderthals were the closest evolutionary relatives to modern humans and lived in Europe and Western Asia until disappearing 30,000 years ago. A draft sequence of the Neanderthal genome from three individuals was presented, composed of over 4 billion nucleotides. Comparisons with five modern human genomes identified regions potentially affected by selection in ancestral modern humans, involving genes related to metabolism, cognition, and skeletal development. Analysis suggests Neanderthals shared more genetic variants with non-Africans, indicating gene flow from Neanderthals into their ancestors occurred before Eurasian groups diverged.
The document discusses analyzing ancient plant and insect DNA extracted from ice core samples in Greenland. Key points:
- Plant and insect DNA was recovered from silty ice samples taken between 2-3 km deep in the Dye 3 and JEG ice cores in Greenland, dating back to before the last glacial period.
- The DNA was identified as coming from tree species like pine and alder, indicating a boreal forest environment in southern Greenland at the time, rather than today's Arctic conditions.
- Other plant species identified include those from orders like Asterales, Poales, Rosales and Malpighiales. Insect DNA from Lepidoptera was also recovered.
1. The document discusses best practices for scientific software development, including writing code for people rather than computers, automating repetitive tasks, using version control, and conducting code reviews.
2. Specific approaches and tools recommended are planning for mistakes, automated testing, continuous integration, and using a coding style guide. R and Ruby style guides are provided as examples.
3. The benefits of following such practices are improving productivity, reducing errors, making code easier to read and maintain, and allowing scientists to focus on scientific questions rather than software issues. Reproducible and sustainable software is the overall goal.
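As one small illustration of planning for mistakes and automated testing, the sketch below pairs a function with a test that can be rerun after every change. The unit-conversion function is a hypothetical example, and Python is used here although the summarized deck provides R and Ruby style guides.

```python
def celsius_to_kelvin(celsius):
    """Convert a temperature from degrees Celsius to kelvin."""
    if celsius < -273.15:
        # Fail loudly on impossible input instead of returning nonsense.
        raise ValueError("temperature below absolute zero")
    return celsius + 273.15


def test_celsius_to_kelvin():
    """Automated checks covering a typical value, a boundary and bad input."""
    assert celsius_to_kelvin(0) == 273.15
    assert celsius_to_kelvin(-273.15) == 0.0
    try:
        celsius_to_kelvin(-300)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for impossible temperature")


if __name__ == "__main__":
    test_celsius_to_kelvin()
    print("all tests passed")
```

A test runner or continuous integration service could call `test_celsius_to_kelvin` automatically, so a regression is caught before it reaches an analysis.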
This document provides an introduction to regular expressions (regex) for text search and pattern matching. It explains that regex allows for powerful text searches beyond simple keywords. Various special symbols and constructs are demonstrated that allow matching complex patterns and variants in text. Examples show matching names, sequences, microsatellite repeats and more with regex. Functions, loops and logical operators in R programming are also briefly covered.
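A regex for microsatellite repeats might look like the sketch below. It is written with Python's `re` module rather than the R functions the slides use, and the motif length range (2 to 6 bases), the default minimum repeat count and the function name are assumptions made for the example.

```python
import re


def find_microsatellites(sequence, min_repeats=4):
    """Find short motifs repeated back to back in a DNA sequence.

    Returns a list of (start position, motif, number of repeats) tuples.
    """
    # ([ACGT]{2,6}?) captures a motif of 2 to 6 bases, shortest first;
    # \1{n,} then requires at least n further consecutive copies of it.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(sequence)
    ]


if __name__ == "__main__":
    print(find_microsatellites("TTGACACACACACGGT"))  # [(3, 'AC', 5)]
```

The backreference `\1` is what lets a single pattern match any repeated motif without listing each motif explicitly.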
The document discusses major geological drivers of evolution including tectonic plate movement, vulcanism, climate change, and meteorite impacts. Tectonic plate movement has caused continental drift and formation of supercontinents like Pangaea, affecting species distributions. Vulcanism causes both local and global climate changes through emission of gases and particles and formation of new land barriers and islands. Climate changes over geological timescales have also impacted evolution. Meteorite impacts have precipitated mass extinctions. These geological forces alter Earth's conditions and drive evolution through large-scale migrations, speciation events, mass extinctions, and adaptive radiations.
This document provides an overview and schedule for the course "SBC 361 Research Methods & Comms". The course is a mixture of advanced analytical skills taught in computer labs using the programming language R, and theoretical content covered in lectures and workshops. It includes two workshops on careers in science and popular science writing. Students will complete assignments involving the computer practicals and tutorials, and a mock exam. The schedule details the topics to be covered each week by different professors and teaching staff. It emphasizes the importance of attending classes, completing required work, and doing additional outside reading to succeed in the course.
This document discusses computational methods and challenges for genome assembly using next-generation sequencing data. It describes the four main stages of genome assembly as preprocessing filtering, graph construction, graph simplification, and postprocessing filtering. Each stage processes the data from the previous stage to build the assembly graph and reduce complexity, though some assemblers delay filtering steps.
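The graph construction stage can be sketched as follows, assuming a de Bruijn graph, which is one common choice; the summary does not say which graph type the document uses. The function name and the reads in the usage line are made up for illustration.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Build a de Bruijn graph from reads.

    Each k-mer contributes one edge from its first k-1 bases (prefix)
    to its last k-1 bases (suffix). Repeated k-mers give repeated edges.
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    print(de_bruijn_edges(["ACGT", "CGTA"], 3))
```

In this representation the later simplification stage would amount to merging unbranched paths and removing edges that are likely sequencing errors, neither of which is attempted here.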
This document outlines the course SBC322 Ecological and Evolutionary Genomics. It discusses how new genomic technologies have changed ecology and evolution research by merging molecular and ecological approaches. It aims to critically evaluate research questions, methods, experimental designs and applications in ecological and evolutionary genomics. The course will improve students' skills in critically reading literature, understanding interdisciplinary science, and oral and written scientific communication through interactive small group work, informal and formal presentations, blog posts, and peer review.
The document provides an overview of topics covered in a bioinformatics course, including using Unix, bioinformatics algorithms, biological databases, sequencing technologies, and genome assembly and variant identification. It lists challenges for students in each topic area and provides examples of concepts that will be covered, such as using HPC systems, dynamic programming for sequence alignment, accessing databases like NCBI, processing sequencing data, and identifying variants from assembly. Images are included of different organisms like ants and sequencing technologies. The document aims to outline the scope and challenges of the bioinformatics course.
Sustainable software institute Collaboration workshopYannick Wurm
The document discusses tools for analyzing biological data. It summarizes four tools:
1. SequenceServer - A simple web interface for BLAST that handles formatting and installing BLAST locally.
2. oSwitch - Allows rapidly switching between operating systems and container environments to access specific bioinformatics software without installation.
3. GeneValidator - Helps curate gene predictions by identifying problematic predictions, choosing best alternative models, and aiding manual curation of individual genes.
4. Afra - A crowdsourcing platform that aims to crowdsource the visual inspection and correction of gene models by recruiting and training students, ensuring quality through tutorials, redundancy and senior review, and creating small, simple initial tasks.
This document provides an overview of genomic tools and best practices for scientific computing. It discusses SequenceServer, a tool for BLAST searches, and Bionode, a collection of Node.js modules for bioinformatics. It also discusses challenges with gene prediction and introduces GeneValidator, a tool for visual inspection and manual correction of gene predictions. Key points include automating repetitive tasks, writing code for people through style guides, and using version control and modularization to improve code quality and reproducibility.
23. Solenopsis invicta fire ants are
a big problem!
very well studied!
Ascunce et al 2011
24. Solenopsis invicta fire ant: two social forms
Single-queen form:
•1 large queen
•Independent founding
•Highly territorial
•Many sizes of workers
Multiple-queen form:
•2-100 smaller queens
•Dependent founding
•No inter-colony aggression
•All workers similar size
25. Population genetics: Allozyme screen
Fire ants + Ken Ross + L. Keller + “starch gel”
26. Allozyme screen: Social form associated to Gp-9 locus
[Figure: frequency of the most common allele (axis 0.3 to 1.0) at each locus, for the single-queen and multiple-queen forms. Loci: Ddh-1, Pro-5, Est-4, G3pdh-1, Ca-4, Est-6, Pgm-4, Acy-1, Pgm-1, acoh-1, Pgm-3, Acoh-5, Aat-2, Gp-9]
Ken Ross and colleagues
Laurent Keller and colleagues
27. Social form completely associated to Gp-9 locus
Single queen form Multiple queen form
Ken Ross and colleagues
Laurent Keller and colleagues
28. Social form completely associated to Gp-9 locus
Single queen form Multiple queen form
(< 5% ) (>15% )
BB BB Bb bb
Ken Ross and colleagues
Laurent Keller and colleagues
29. Social form completely associated to Gp-9 locus
Single queen form Multiple queen form
(< 5% ) (>15% )
x
BB BB Bb bb
Gp-9 bb females rare
Ken Ross and colleagues
Laurent Keller and colleagues
30. Social form completely associated to Gp-9 locus
Single queen form Multiple queen form
(< 5% ) (>15% )
BB BB Bb
Ken Ross and colleagues
Laurent Keller and colleagues
31. Social form completely associated to Gp-9 locus
Single queen form Multiple queen form
(< 5% ) (>15% )
BB BB Bb
x
Ken Ross and colleagues
Laurent Keller and colleagues
32. Social form completely associated to Gp-9 locus
Single queen form Multiple queen form
(< 5% ) (>15% )
BB BB Bb
x x
Ken Ross and colleagues
Laurent Keller and colleagues
33. Social form completely associated to Gp-9 locus
Single queen form Multiple queen form
(< 5% ) (>15% )
BB BB Bb
x x x
Ken Ross and colleagues
Laurent Keller and colleagues
34. Social form completely associated to Gp-9 locus
• Is this gene the single überregulator?
35. Social form completely associated to Gp-9 locus
• Is this gene the single überregulator?
• Only 14 allozyme markers were used: maybe 1/14th of the genome?
[Figure repeated from slide 26: frequency of the most common allele (axis 0.3 to 1.0) at each of the 14 loci, for the single-queen and multiple-queen forms]
36. This changes everything.
454, Illumina, SOLiD...
Any lab can sequence anything!
37.
38. Are other genes linked to Gp-9?
Sequenced:
•Genome of a Gp-9 B ♂
Sequencing from haploid males (for easier assembly):
45× (330bp-insert paired reads) + (normal single-Single ♂:
His brothers:
B 20x
11×
4×
(8,000 & 20,000bp-insert paired reads)
41. Genome of a Gp-9 B ♂ fire ant
Wurm et al 2011
★ Expansion of lipid-processing gene families (for Cuticular Hydrocarbons)
★ 420 putative olfactory receptors (more than any other insect!)
★ Functional DNA-methylation system
★ Ant-specific duplication and subfunctionalization of vitellogenin (in bees: involved in reproduction & division of labor)
[Background figure: phylogenetic tree of the putative S. invicta olfactory receptors, with tip labels of the form SiOR04171+17; scale bar 0.05]
[Excerpt from Wurm et al 2011 shown on the slide:]
significance of these duplication events in vitellogenins, odor perception genes, and a family of lipid-processing genes. We also discuss additional features of interest in the fire ant genome relevant to the complex social biology of this species, including sex determination genes, DNA methylation genes, telomerase, and the insulin and juvenile hormone pathways.
Vitellogenins. In contrast to other insects that mainly have only one or two vitellogenins, the fire ant genome harbors four adjacent [...] after duplication to acquire caste-specific functions.
Odor Perception. Consistent with studies in other insects, we find a single S. invicta ortholog to DmOr83b, a broadly expressed olfactory receptor (OR) required to interact with other ORs for Drosophila and Tribolium castaneum olfaction (30–32). Beyond OR83b, OR number varies greatly between insect species. Blast searches and GeneWise searches using an HMM profile constructed with aligned ORs from N. vitripennis (33) and Pogonomyrmex barbatus identified more than 400 loci in the S. invicta genome with significant sequence similarity to ORs. Preliminary work on gene model reconstruction identified 297 intact full-length proteins. Many S. invicta ORs are in tandem arrays (Fig. S2A) and derive from recent expansions. S. invicta may thus harbor the largest identified insect OR repertoire because there are 10 ORs in Pediculus humanus (34), 60 in Drosophila, 165 in A. mellifera, 225 in N. vitripennis (33), and 259 in T. castaneum (32). The large numbers of N. vitripennis and T. castaneum ORs are thought to be due to current or past difficulties in host and food finding. As has been suggested for A. mellifera (35), the large number of S. invicta ORs may result from the importance of chemical communication in ants. The odorant-binding proteins (OBPs) are another family of genes also known to play roles in chemosensation in Drosophila (36). Intriguingly, the social organization of S. invicta colonies is completely associated with se-
Fig. 2. Taxonomic distribution of best blastp hits of S. invicta proteins to the nonredundant (nr) protein database (E < 10^-5). Results were first plotted using MEGAN software (22) and then branches with fewer than 20 hits were removed, branch lengths were reduced for compactness, and tree topology was adjusted to reflect consensus phylogenies (23, 24).
[Fig. 2 labels as extracted: Deuterostomia 173; Nematoda 25; Cnidaria 100; Not assigned 274; No hits 3424]
Fig. 3. S. invicta vitellogenins. (A) Four vitellogenins are located within a single 40,000-bp region of the S. invicta genome. (B) Parsimony tree of known hymenopteran
[Fig. 3 panels as extracted: (A) gene order Vg4, Vg1, Vg3, Vg2 between 2,330,000 bp and 2,360,000 bp; (B) tree tips Solenopsis Vg1, Vg4, Vg2, Vg3, Apis Vg, Bombus Vg, Nasonia Vg1, Pteromalus Vg, Nasonia Vg2, Encarsia Vg, Pimpla Vg, Athalia Vg, with clade labels Apocrita, Tenthredinoidea, Vespoidea, Apoidea, Aculeata, Chalcidoidea, Ichneumonoidea; (C) Q vs W values for Vg1 to Vg4: 142, 389, 1, 40, 17820, 1.4, 9269, 0.6]
42. Are other genes linked to Gp-9?
Social form completely associated to Gp-9 locus
Single queen form Multiple queen form
(< 5% ) (>15% )
BB BB Bb
x x x
43. Are other genes linked to Gp-9?
Sequenced:
•Genome of a Gp-9 B ♂
•Genome of a Gp-9 b ♂
RAD sequencing
“Next Generation Genotyping.”
44. RAD sequencing
“Next Generation Genotyping.”
Bb
unfertilised eggs
haploid ♂
Gp-9 B Gp-9 b Gp-9 B Gp-9 b Gp-9 b Gp-9 B
38 B♂ & 38 b♂
45. RAD sequencing of haploid ♂ for SNP discovery & genotyping
EcoRI EcoRI EcoRI
Gp-9 B
46. RAD sequencing of haploid ♂ for SNP discovery & genotyping
EcoRI EcoRI EcoRI
AACTG AACTG AACTG AACTG
Gp-9 B
47. RAD sequencing of haploid ♂ for SNP discovery & genotyping
38 Gp-9 B males: AACTG, GGCCT, AAGGT
38 Gp-9 b males: CCAGT, TAAAT, GGAAT
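The RAD idea on these slides (read only the DNA next to each EcoRI cut site, so the same small fraction of the genome is sequenced in every individual) can be sketched roughly as below. This is a minimal illustration, not the pipeline used in the talk. The only fixed fact is the EcoRI recognition site GAATTC; the toy sequences, the tag length, and the names `make_toy`, `rad_tags` and `differing_sites` are assumptions, and the five-base strings from the slide are reused purely as toy data.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence


def make_toy(tags, spacer="CC"):
    """Build a toy chromosome (hypothetical data): each tag sits right after an EcoRI site."""
    return "TT" + "".join(ECORI_SITE + tag + spacer for tag in tags)


def rad_tags(sequence, tag_length=5):
    """Return the bases immediately downstream of every EcoRI site, in order."""
    tags = []
    start = sequence.find(ECORI_SITE)
    while start != -1:
        tag_start = start + len(ECORI_SITE)
        tag = sequence[tag_start:tag_start + tag_length]
        if len(tag) == tag_length:  # skip sites too close to the sequence end
            tags.append(tag)
        start = sequence.find(ECORI_SITE, start + 1)
    return tags


def differing_sites(tags_a, tags_b):
    """Indices of cut sites whose tags differ between two individuals (candidate SNPs)."""
    return [i for i, (a, b) in enumerate(zip(tags_a, tags_b)) if a != b]


# Two toy haploid males that differ by one base in the second tag.
male_1 = make_toy(["AACTG", "GGCCT", "AAGGT"])
male_2 = make_toy(["AACTG", "GGACT", "AAGGT"])

tags_1 = rad_tags(male_1)
tags_2 = rad_tags(male_2)
print(tags_1, tags_2, differing_sites(tags_1, tags_2))
```

Because the males are haploid, each individual contributes a single tag per site, so a difference between two tags is directly a genotype difference with no heterozygote calling needed.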
48. RADseq: sequencing the same 0.01% of the genome in many individuals
Identify polymorphism
individual x locus genotype table (columns: individuals A-F; rows: loci L1-L6; "-" = missing)
     A  B  C  D  E  F
L1   A  C  A  A  C  C
L2   G  G  T  -  T  G
L3   -  A  G  A  -  G
L4   C  -  -  G  G  C
L5   T  T  C  T  C  -
L6   G  A  A  -  -  G
2419 loci
38 B♂ & 38 b♂
PCA: Principal Component Analysis
Amount of variance explained per principal component (% variance explained):
PC1 12.7%; PC2 6.1%; PC3 5.4%; PC4 4.8%; PC5 4.7%; PC6 3.9%; PC7 3.5%; PC8 3.2%; PC9 3.1%; PC10 2.9%; PC11 2.8%; PC12 2.6%; PC13 2.4%; PC14 2.3%; PC15 2.2%; PC16 2.0%; PC17 1.9%; PC18 1.7%; PC19 1.6%; PC20+ 30.2%
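A rough sketch of how "variance explained per principal component" can be computed from a haploid genotype table like the toy one on this slide. It is an illustration under stated assumptions, not the analysis behind the slide: alleles are coded 0/1 per locus, missing genotypes are replaced by the locus mean, and the function names `encode` and `variance_explained` are made up for this example. The real data had 2419 loci and 76 males; the six-by-six table is used only because it is on the slide.

```python
import numpy as np

# Toy table from the slide: one string per locus (L1..L6),
# one character per individual (A..F), "-" = missing genotype.
table = [
    "ACAACC",
    "GGT-TG",
    "-AGA-G",
    "C--GGC",
    "TTCTC-",
    "GAA--G",
]


def encode(loci):
    """Code each locus as 0 (first allele seen) or 1 (other allele), NaN if missing.

    Returns an individuals x loci array.
    """
    rows = []
    for locus in loci:
        reference = next(allele for allele in locus if allele != "-")
        rows.append([np.nan if a == "-" else float(a != reference) for a in locus])
    return np.array(rows).T


def variance_explained(genotypes):
    """Fraction of total variance carried by each principal component."""
    locus_means = np.nanmean(genotypes, axis=0)
    filled = np.where(np.isnan(genotypes), locus_means, genotypes)  # mean imputation
    centered = filled - filled.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


ratios = variance_explained(encode(table))
print(np.round(100 * ratios, 1))  # percentages, largest component first
```

Mean imputation is the simplest way to handle the "-" cells; it pulls individuals with many missing genotypes toward the origin, so a real analysis would usually filter loci and individuals by missingness first.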
49. Principal Components: PC2 vs PC3
[Scatter plot of PC2 (6.073% of variance) and PC3 (5.441% of variance); points are Gp-9 B ♂ and Gp-9 b ♂]
50. Principal Components: PC1 vs PC2
[Scatter plot of PC1 (12.666% of variance) and PC2 (6.073% of variance); points are Gp-9 B ♂ and Gp-9 b ♂]
52. Sex chromosomes
X Y
“Social chromosomes”
Gp-9 B
Gp-9 b
?
SB Sb
Why non-recombining?
53. Structural differences between B and b
likely inhibit recombination
Small portion (200,000bp) of social chromosome
SB
Sb
54. Structural differences between B and b likely inhibit recombination
[Figure: Fluorescence in situ hybridization of markers A22, E17 and E3 on the SB chromosome of a Gp-9 B male and the Sb chromosome of a Gp-9 b male, shown next to the Gp-9B genetic map]
John Wang @ Taipei
56. X
♀
♂
X X Y
Single queen colony Multiple queen colony
>10
rearrangements
•Prediction:
SB SB SB Sb
Differences between SB and Sb?
•genes?
Region contains 800 genes! only small differences
⟹ Prediction: directional (antagonistic?) selection?
Sb is degenerating?
57. Sb is degenerating?
• More, larger repetitive DNA in Sb compared to SB
• larger introns in Sb
• larger intergenic regions in Sb
• assembly worse (smaller scaffolds) in Sb
• increased dN/dS
[Figure: scaffold length (bp; axis 0 to 6,000,000) by genome assembly (Gp-9B male, Gp-9b male) and region: normally recombining regions from all 16 linkage groups vs. the SB or Sb region without recombination in Gp-9 Bb queens. Significance groups as extracted: [a]; [a], [b]; [a]; [c]. [a] vs. [c]: p < 10^-7; [b] vs. [c]: p < 10^-4]
58. X
♀
♂
X X Y
Single queen colony Multiple queen colony
Likely several
rearrangements
•Prediction:
SB SB SB Sb
Differences between SB and Sb?
•genes?
Region contains 800 genes! only small differences
⟹ Prediction: directional (antagonistic?) selection?
Sb is degenerating?
•Probably ♂ haploidy: strong purifying selection
⟹ slower degeneration
59. Formica selysi
Alpine silver ant
Common ancestor with fire ant: 130 MYA
Purcell et al 2014 Convergent social chromosome architecture
J Meunier
Single queen form Multiple queen form
60. ≠ social chromosomes
Purcell et al 2014
Solenopsis
invicta social
chromosome
Formica selysi
social
chromosome
62. Summary
Ants are cool.
Fire ant Solenopsis invicta queen number determined by Gp-9 genotypes:
•only BB workers ➔ single BB queen
•with Bb workers ➔ multiple Bb queens
Genome sequencing + RAD Genotyping >500 individuals
•Gp-9 marks ~4% of genome ➔ social chromosomes:
SB is like X; Sb is like Y
Structural differences between SB and Sb ➔ no recombination
Formica selysi: Convergent evolution of social chromosomes
63. Generally
genome evolution social evolution
Major transitions:
Single- vs. Multiple queenness
Eusociality
Social parasitism
Strengths of selection in
social evolution
concepts & mechanisms
Candidate gene studies
Vitellogenin
Sex determination genes
Caste differentiation (Medically
relevant)
Pollinator health
Tools for genomics work on emerging model organisms
82. GeneValidator
Run on:
★ whole geneset: identify most problematic predictions
★ alternative models for a gene (choose best)
★ individual genes (while manually curating)
90. Timelines
• Rolled out to:
• 8 MSc students
• 20 3rd year students
• Need to improve tutorials/guidance/documentation
• Roll out to 200 first years (few months)
• Expand
@yeban Anurag Priyam
91. Summary
Ants are cool.
Fire ant Solenopsis invicta queen number determined by Gp-9 genotypes:
•only BB workers ➔ single BB queen
•with Bb workers ➔ multiple Bb queens
Genome sequencing + RAD Genotyping >500 individuals
•Gp-9 marks ~4% of genome ➔ social chromosomes:
SB is like X; Sb is like Y
Structural differences between SB and Sb ➔ no recombination
Formica selysi: Convergent evolution of social chromosomes
A few tools for genomics:
• http://sequenceserver.com (easy custom BLAST server)
• http://bionode.io (Agile data & analysis workflows)
• http://afra.sbcs.qmul.ac.uk (Crowdsourcing gene feature curation)
• http://genevalidator.sbcs.qmul.ac.uk (Identifying gene prediction problems)
92. QMUL lab (R Pracana, B Vieira, E
Stolle, A Priyam, I Moghul et al)
Laurent Keller lab (J WANG, D
SHOEMAKER,O RIBA-GROGNUZ,
M Nipitwattanaphon)
Michel Chapuisat, Jess Purcell,
Alan Brelsford, Nicolas Perrin
Ecology & Evolution & Vital-IT
@ Lausanne
Organismal Biology & Psych
@ Queen Mary
http://wurmlab.github.io
M Corona, S Nygaard, BG Hunt, KK Ingram, L
Falquet, M Nipitwattanaphon, D Gotzek, MB Dijkstra,
J Oettler, F Comtesse, CJ Shih, WJ Wu, CC Yang, J
Thomas, E Beaudoing, S Pradervand, V Flegel, ED
Cook, R Fabbretti, H Stockinger, L Long, WG
Farmerie, J Oakey, JJ Boomsma, P Pamilo, SV Yi, J
Heinze, MAD Goodisman, L Farinelli, K Harshman, N
Hulo, L Cerutti, Ioannis Xenarios
93. Age of the region based on dS
[Figure: histograms of dS (x-axis 0.00 to 0.20; y-axis count) for leafcutter ant genes (leafcutterDndsSubset$dS) and for Gp-9-linked Solenopsis genes (subset(dndsdata, gp9linked == TRUE)$dS)]
Leafcutter common ancestor: 8,000,000-10,000,000 years ago
Maximum Likelihood Estimation of SB/Sb age: 280,000-425,000 years
⟹ little time for degeneration
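One simple way to see how dS can be turned into an age: assume synonymous substitutions accumulate at a constant rate, calibrate that rate with the leafcutter split (8 to 10 million years), and scale. The slide reports a maximum likelihood estimate, so the linear scaling below is a deliberately simpler stand-in, not the method used. The function name `age_from_ds` and both dS values are hypothetical placeholders chosen only for illustration; they are not data from the talk.

```python
def age_from_ds(ds_region, ds_calibration, calibration_age_years):
    """Scale a calibration age by the ratio of synonymous divergences (strict-clock assumption)."""
    if ds_calibration <= 0:
        raise ValueError("calibration dS must be positive")
    return calibration_age_years * ds_region / ds_calibration


# Hypothetical typical dS values, for illustration only.
HYPOTHETICAL_LEAFCUTTER_DS = 0.05
HYPOTHETICAL_SB_SB_DS = 0.002

for calibration_age in (8_000_000, 10_000_000):
    age = age_from_ds(HYPOTHETICAL_SB_SB_DS, HYPOTHETICAL_LEAFCUTTER_DS, calibration_age)
    print(f"calibration {calibration_age:,} years -> region age about {age:,.0f} years")
```

Because both dS values are pairwise divergences, the factor of two for the two diverging lineages cancels in the ratio, which is why no explicit substitution rate appears.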
96. Most BB vs Bb gene expression differences map to S
Non-recombining region of S contains 800 genes
Gene Expression Patterns for a Social Trait
Gene expression: Gp-9 Bb vs BB workers in multiple queen colonies
20 of 29 significant genes are in the SB/Sb region (p < 10^-10)
Similar for BB vs Bb queens; & for B vs b males. Wang et al 2008
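To get a feel for why 20 of 29 significant genes falling in the SB/Sb region is so unlikely by chance, here is a hedged sketch using a hypergeometric upper tail. The slide does not say which test produced its p-value, and the real gene universe would be the set of genes measured in the expression study. The total of 20,000 genes is an assumption derived from the talk's own figures (800 genes in a region marking about 4% of the genome), and `hypergeom_upper_tail` is a name made up for this example.

```python
from math import comb


def hypergeom_upper_tail(total_genes, genes_in_region, drawn, observed_in_region):
    """P(X >= observed_in_region) when `drawn` genes are picked at random without replacement."""
    upper = min(drawn, genes_in_region)
    favourable = sum(
        comb(genes_in_region, k) * comb(total_genes - genes_in_region, drawn - k)
        for k in range(observed_in_region, upper + 1)
    )
    return favourable / comb(total_genes, drawn)


ASSUMED_TOTAL_GENES = 20_000  # assumption: 800 genes / ~4% of the genome
GENES_IN_REGION = 800         # from the slide
SIGNIFICANT_GENES = 29        # from the slide
SIGNIFICANT_IN_REGION = 20    # from the slide

p_value = hypergeom_upper_tail(
    ASSUMED_TOTAL_GENES, GENES_IN_REGION, SIGNIFICANT_GENES, SIGNIFICANT_IN_REGION
)
expected = SIGNIFICANT_GENES * GENES_IN_REGION / ASSUMED_TOTAL_GENES
print(f"expected by chance: {expected:.2f} genes; upper-tail p = {p_value:.3g}")
```

Under these assumptions only about one of the 29 genes would be expected in the region by chance, so observing 20 gives a tail probability far below the 10^-10 threshold quoted on the slide.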
97. Algorithm discovery by protein folding game players
Firas Khatib^a, Seth Cooper^b, Michael D. Tyka^a, Kefan Xu^b, Ilya Makedon^b, Zoran Popović^b, David Baker^a,c,1, and Foldit Players
^aDepartment of Biochemistry; ^bDepartment of Computer Science and Engineering; and ^cHoward Hughes Medical Institute, University of Washington, Box 357370, Seattle, WA 98195
Contributed by David Baker, October 5, 2011 (sent for review June 29, 2011)
Foldit is a multiplayer online game in which players collaborate and compete to create accurate protein structure models. For specific hard problems, Foldit player solutions can in some cases outperform state-of-the-art computational methods. However, very little is known about how collaborative gameplay produces these results and whether Foldit player strategies can be formalized and structured so that they can be used by computers. To determine whether high performing player strategies could be collectively codified, we augmented the Foldit gameplay mechanics with tools for players to encode their folding strategies as “recipes” and to share their recipes with other players, who are able to further modify and redistribute them. Here we describe the rapid social evolution of player-developed folding algorithms that took place in the year following the introduction of these tools. Players developed over 5,400 different recipes, both by creating new algorithms and by modifying and recombining successful recipes developed by other players. The most successful recipes rapidly spread through the Foldit player population, and two of the recipes became particularly dominant. Examination of the algorithms encoded in these two recipes revealed a striking similarity to an unpublished algorithm developed by scientists over the same period. Benchmark calculations show that the new algorithm independently discovered by scientists and by Foldit players outperforms previously published methods. Thus, online scientific game frameworks have the potential not only to solve hard scientific problems, but also to discover and formalize effective new strategies and algorithms.
citizen science ∣ crowd-sourcing ∣ optimization ∣ structure prediction ∣ strategy
Citizen science is an approach to leveraging natural human
abilities for scientific purposes. Most such efforts involve
visual tasks such as tagging images or locating image features
(1–3). In contrast, Foldit is a multiplayer online scientific discovery
game, in which players become highly skilled at creating accurate
protein structure models through extended game play (4, 5). Foldit
recruits online gamers to optimize the computed Rosetta energy
using human spatial problem-solving skills. Players manipulate
protein structures with a palette of interactive tools and manipula-tions.
Through their interactive exploration Foldit players also uti-lize
user-friendly versions of algorithms from the Rosetta structure
prediction methodology (6) such as wiggle (gradient-based energy
minimization) and shake (combinatorial side chain rotamer pack-ing).
The potential of gamers to solve more complex scientific pro-blems
was recently highlighted by the solution of a long-standing
protein structure determination problem by Foldit players (7).
One of the key strengths of game-based human problem ex-ploration
is the human ability to search over the space of possible
strategies and adapt those strategies to the type of problem and
stage of problem solving (5). The variability of tactics and
strategies stems from the individuality of each player as well as
multiple methods of sharing and evolution within the game
(group play, game chat), and outside of the game [wiki pages (8)].
One way to arrive at algorithmic methods underlying successful
human Foldit play would be to apply machine learning techniques
to the detailed logs of expert Foldit players (9). We chose instead
to rely on a superior learning machine: Foldit players themselves.
As the players themselves understand their strategies better than
anyone, we decided to allow them to codify their algorithms
directly, rather than attempting to automatically learn approximations.
We augmented standard Foldit play with the ability to
create, edit, share, and rate gameplay macros, referred to as
“recipes” within the Foldit game (10). In the game each player
has their own “cookbook” of such recipes, from which they can
invoke a variety of interactive automated strategies. Players can
share recipes they write with the rest of the Foldit community or
they can choose to keep their creations to themselves.
In this paper we describe the unexpected evolution of
recipes in the year after they were released, and the striking convergence
of this very short evolution on an algorithm very similar
to an unpublished algorithm, recently developed independently
by scientific experts, that improves over previous methods.
Results
In the social development environment provided by Foldit,
players evolved a wide variety of recipes to codify their diverse
problem-solving strategies. During the three-and-a-half-month
study period (see Materials and Methods), 721 Foldit players ran
5,488 unique recipes 158,682 times and 568 players wrote 5,202
recipes. We studied these algorithms and found that they fell
into four main categories: (i) perturb and minimize, (ii) aggressive
rebuilding, (iii) local optimize, and (iv) set constraints. The first
category goes beyond the deterministic minimize function
provided to Foldit players, which has the disadvantage of readily
being trapped in local minima, by adding in perturbations to lead
the minimizer in different directions (11). The second category
uses the rebuild tool, which performs fragment insertion with
loop closure, to search different areas of conformation space;
these recipes are often run for long periods of time as they are
designed to rebuild entire regions of a protein rather than just
refining them (Fig. S1). The third category of recipes performs
local minimizations along the protein backbone in order to improve
the Rosetta energy for every segment of a protein. The final
category of recipes assigns constraints between beta strands or
pairs of residues (rubber bands), or changes the secondary structure
assignment to guide subsequent optimization.
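The first category, perturb and minimize, can be sketched as follows. This is an illustrative toy and not a player-written recipe: the rugged one-dimensional `energy` function, the fixed-step gradient descent in `minimize`, and the parameters `rounds`, `kick`, and `seed` are all assumptions chosen to show why adding perturbations helps a deterministic minimizer escape local minima.

```python
import math
import random


def energy(x):
    """Hypothetical rugged 1-D landscape with several local minima."""
    return 0.1 * x * x + math.sin(3.0 * x)


def gradient(x):
    """Analytical derivative of the toy energy."""
    return 0.2 * x + 3.0 * math.cos(3.0 * x)


def minimize(x, step=0.01, iterations=2000):
    """Deterministic gradient descent; it stops in whichever basin it starts in."""
    for _ in range(iterations):
        x -= step * gradient(x)
    return x


def perturb_and_minimize(x, rounds=50, kick=2.0, seed=0):
    """Alternate random kicks with minimization, keeping the best result so far."""
    rng = random.Random(seed)
    best = minimize(x)
    for _ in range(rounds):
        candidate = minimize(best + rng.uniform(-kick, kick))
        if energy(candidate) < energy(best):
            best = candidate
    return best


if __name__ == "__main__":
    start = 4.0
    print("minimize only:       ", energy(minimize(start)))
    print("perturb and minimize:", energy(perturb_and_minimize(start)))
```

Because a candidate is accepted only when it lowers the energy, the result is never worse than plain minimization from the same starting point; whether it reaches the global minimum depends on the size of the kicks and the number of rounds.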
Different algorithms were used with very different frequencies
during the experiment. Some are designated by the authors as
public and are available for use by all Foldit players, whereas
others are private and available only to their creator or their
Foldit team. The distribution of recipe usage among different
players is shown in Fig. 1 for the 26 recipes that were run over
1,000 times. Some recipes, such as the one represented by the
leftmost bar, were used many times by many different players,
while others, such as the one represented by the pink bar in the
Author contributions: F.K., S.C., Z.P., and D.B. designed research; F.K., S.C., M.D.T., and
F.P. performed research; F.K., S.C., M.D.T., K.X., and I.M. analyzed data; and F.K., S.C., Z.P.,
and D.B. wrote the paper.
The authors declare no conflict of interest.
Freely available online through the PNAS open access option.
1To whom correspondence should be addressed. E-mail: dabaker@u.washington.edu.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1115898108/-/DCSupplemental.
BIOPHYSICS AND COMPUTATIONAL BIOLOGY
PSYCHOLOGICAL AND COGNITIVE SCIENCES
http://Fold.it
98. Research themes
Social evolution
Pollinator health
• Biomedical approaches
• International population genomics surveys
• Monitoring via sequencing
• Responses to environmental challenges
Modern Bioinformatics for Genomics
• Reproducibility
• Accuracy
• Sustainability
• Versioning
• Open source
• Agile & efficient data handling
• Major social transitions
» social chromosomes
» convergence
» eusociality, queen number, parasitism...
• 100-fold intra-specific variation in lifespan
• Strengths of selection
• Candidate genes/pathways
99.
100. Gp-9 is an odorant binding protein
Hypothesis: influences queen odor & how workers « smell » queens
Krieger & Ross