The talk describes the field of Data Science and gives examples of several multi-model applications (tables, text, and graphs) across multiple disciplines (biology, nursing, education).
Supervised Multi Attribute Gene Manipulation For Cancer (paperpublications3)
Abstract: Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviours, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by retrospective tools typical of decision support systems.
They scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations. Data mining techniques are the result of a long process of research and product development. This evolution began when business data was first stored on computers, continued with improvements in data access, and more recently, generated technologies that allow users to navigate through their data in real time. Data mining takes this evolutionary process beyond retrospective data access and navigation to prospective and proactive information delivery.
Presentation on CIAT's IABIN tools project on threats to biodiversity in Latin America, presented in Costa Rica in February 2011. See http://dapa.ciat.cgiar.org for more information.
Zuur et al. 2010, Methods in Ecology and Evolution, a protocol for data explorat... (Lisiane Zanella)
This document provides a protocol for data exploration to avoid common statistical problems when analyzing ecological data. It discusses exploring data for outliers, heterogeneity, collinearity, dependence, and other issues. The protocol aims to identify potential problems before statistical analysis to reduce type I and II errors and ensure robust conclusions. Data exploration is presented as an essential first step, taking up to 50% of analysis time. Graphical tools are emphasized over tests for exploring data visually and identifying issues to address. The document provides examples and discusses handling outliers and other problems when they arise.
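One step of such a protocol, checking predictors for collinearity, can be sketched with variance inflation factors. This is a minimal illustration with simulated data, not code from the paper (which works in R); the cutoff of 10 is one common rule of thumb, and some authors use stricter values such as 3.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of design matrix X.

    VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
    column j on all the remaining columns. Large values (often
    VIF > 10) flag problematic collinearity.
    """
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # add intercept
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return out

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)   # nearly collinear with x1
x3 = rng.normal(size=200)                    # independent covariate
vifs = vif(np.column_stack([x1, x2, x3]))
print([round(v, 1) for v in vifs])           # x1 and x2 get large VIFs, x3 near 1
```

Dropping or combining one of the offending covariates before modeling is the usual remedy the protocol points toward.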
DataONE Education Module 01: Why Data Management? (DataONE)
Lesson 1 in a set of 10 created by DataONE on Best Practices for Data Management. The full module can be downloaded from the DataONE.org website at: http://www.dataone.org/education-modules. Released under a CC0 license; attribution and citation requested.
Sustainable Development Indicators & Metrics (gaiametrics-sr)
John O'Connor opened remarks at the Bibliotheca Alexandrina by discussing frameworks for sustainable development and indicators to monitor progress. He covered topics such as capital stocks, multifactor productivity, intangible assets, and the need for concise indicator sets to track changes in access to resources for current and future generations. O'Connor advocated for overhauling information systems using modern technologies through public-private partnerships to support sustainable development goals.
IABIN Threat Assessment Project Presentation (Costa Rica) by Andy Jarvis from... (Hector)
The document discusses improving biodiversity data quality for South America. It describes assessing occurrence records from three databases to identify reliable coordinates, develop scripts for automated data cleaning, and georeference additional records. Approximately 19,000 species from 3,900 genera were modeled to analyze threats from accessibility, deforestation, and fires. Conservation status was evaluated by calculating protected areas within species ranges. A web-based tool to visualize the results is under development.
Approaches and Techniques for Managing Human-Elephant Conflicts in Western Se... (Isaac Yohana Chamba)
A research proposal submitted toward a Master of Science in Ecosystems Science and Management at Sokoine University of Agriculture (SUA), academic years 2016-2018. The research seeks new approaches to managing human-elephant conflicts for better and more sustainable management of socio-ecological systems in the Ikorongo-Grumeti Game Reserves, in other protected areas within Tanzania, and in areas outside the country facing similar problems. The project is funded by the Singita Grumeti Fund (SGF), 2017.
Statistics and machine learning for integrating data from bio... (tuxette)
This document summarizes a presentation on using statistics and machine learning for integrating high-throughput biological data. It discusses how biological data is large in volume, multi-scaled and heterogeneous in type, creating bottlenecks for analysis. It presents different methods for integrating multiple data tables, including multiple kernel learning to combine similarity matrices. An example application to TARA Oceans data is described, identifying Rhizaria abundance as structuring ocean differences. Interpretability of results is discussed along with prospects for deep learning and predicting phenotypes while understanding relationships.
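The kernel-combination idea mentioned above can be sketched as a convex combination of per-table similarity matrices. This is a toy illustration with invented data; a full multiple kernel learning method would also learn the weights from the data rather than fixing them, as here.

```python
import numpy as np

def combine_kernels(kernels, weights):
    """Convex combination of precomputed similarity (kernel) matrices.

    Each kernel summarizes one data table over the same samples;
    the weighted sum gives a single similarity matrix for downstream
    analysis (clustering, kernel PCA, ...).
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize so weights sum to 1
    return sum(wi * K for wi, K in zip(w, kernels))

rng = np.random.default_rng(1)
# Two toy data tables describing the same 5 samples at different "scales"
X1 = rng.normal(size=(5, 3))
X2 = rng.normal(size=(5, 8))
K1 = X1 @ X1.T                           # linear kernel for table 1
K2 = X2 @ X2.T                           # linear kernel for table 2
K = combine_kernels([K1, K2], [0.7, 0.3])
print(K.shape)                           # a single 5x5 symmetric matrix
```

Working on kernels rather than raw tables is what lets heterogeneous data types (abundances, environmental measurements, networks) be merged into one analysis.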
The document discusses the opportunities for open data in agriculture and nutrition from increased data availability due to advances in life sciences, information technologies, and data analytics. It outlines the need for an infrastructure to support FAIR (findable, accessible, interoperable, reusable) data principles and addresses issues like data rights, infrastructure, interoperability, and gaps. Examples are given of commercial field data and projects in Europe that integrate data from various sources using shared semantics, standards, and services. The presentation encourages partnerships to advance open data goals through working groups addressing specific challenges.
This document provides an introduction to data mining concepts and techniques. It discusses why data mining has become important due to the massive growth of data from various sources. Data mining involves knowledge discovery from large datasets using techniques from machine learning, statistics, pattern recognition and databases. The document outlines common data mining tasks like classification, regression, clustering and discusses applications in domains like fraud detection, customer churn prediction, and sky survey cataloging.
Connecting USDA and NSF Terrestrial Observation Network to Science Policy (Brian Wee)
Large-scale environmental changes pose challenges that straddle environmental, economic, and social boundaries. As we design and implement climate adaptation strategies at the Federal, state, local, and tribal levels, accessible and usable data are essential for implementing actions that are informed by the best available information. Data-intensive science has been heralded as an enabler for scientific breakthroughs powered by advanced computing capabilities and interoperable data systems. Those same capabilities can be applied to data and information systems that facilitate the transformation of data into highly processed products.
At the interface of scientifically informed public policy and data-intensive science lies the potential for producers of credible, integrated, multi-scalar environmental data, such as the National Ecological Observatory Network (NEON) and its partners, to capitalize on data and informatics interoperability initiatives that enable the integration of environmental data from across credible data sources. NEON is designed to provide high-quality, long-term environmental data for research. These data are also meant to be repurposed for operational needs such as risk management, vulnerability assessments, and resource management. The proposed USDA Agricultural Research Service (ARS) Long Term Agro-ecosystem Research (LTAR) network is another example of such an environmental observatory that will produce credible data for environmental/agricultural forecasting and informing policy.
To facilitate data fusion across observatories like NEON and LTAR, there is a growing call for observation systems to more closely coordinate and standardize how variables are measured. Together with observation standards, cyberinfrastructure standards enable the proliferation of an ecosystem of applications that utilize diverse, high-quality, credible data. Interoperability facilitates the integration of data from multiple credible sources and enables the repurposing of data for use at different geographical scales. Metadata that captures the transformation of data into value-added products (“provenance”) lends reproducibility and transparency to the entire process. This way, the datasets and model code used to create any product can be examined by other parties.
This poster outlines a pathway for transforming environmental data into value-added products by various stakeholders to better inform sustainable agriculture using data from environmental observatories including NEON and LTAR.
Depression Detection in Tweets using Logistic Regression Model (ijtsrd)
In today's modernized world, mental health issues such as depression, anxiety, and stress are increasingly common, and social media platforms such as Facebook, Instagram, and Twitter have amplified them. Everything has its merits and drawbacks. During the pandemic, people became more likely to suffer from mental health issues: they are online 24/7 and cut off from the real world. Past studies have shown that individuals who spend more time on social media are more likely to be depressed. In this project, we identify people who are depressed based on their tweets, followers, following, and many other factors. For this, we trained and tested a text classifier that distinguishes between users who are depressed and those who are not. Rahul Kumar Sharma | Vijayakumar A "Depression Detection in Tweets using Logistic Regression Model" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-4, June 2021, URL: https://www.ijtsrd.com/papers/ijtsrd41284.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-miining/41284/depression-detection-in-tweets-using-logistic-regression-model/rahul-kumar-sharma
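The kind of classifier described, logistic regression over word features, can be sketched in a few lines. The example tweets and labels below are invented for illustration and are not the paper's dataset or code; real work would use a labeled corpus and a proper feature pipeline (TF-IDF, held-out evaluation, etc.).

```python
import numpy as np

# Toy corpus: 1 = depressed-sounding, 0 = not (invented examples)
texts = ["i feel hopeless and sad", "so tired of everything sad",
         "great day with friends", "happy and excited today",
         "feeling empty and alone", "loved the sunny weather"]
labels = np.array([1, 1, 0, 0, 1, 0])

# Bag-of-words features: one count column per vocabulary word
vocab = sorted({w for t in texts for w in t.split()})
X = np.array([[t.split().count(w) for w in vocab] for t in texts], float)

# Logistic regression fitted by plain gradient descent on the log-loss
w = np.zeros(X.shape[1])
b = 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))        # sigmoid of linear score
    w -= 0.5 * X.T @ (p - labels) / len(labels)
    b -= 0.5 * (p - labels).mean()

pred = (1 / (1 + np.exp(-(X @ w + b))) > 0.5).astype(int)
print(pred.tolist())   # recovers the labels on this tiny training set
```

Words like "sad" or "alone" end up with positive weights, which is also what makes such a model inspectable: the learned coefficients show which terms drive a "depressed" prediction.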
The document discusses computational social science and the interactions-based approach taken by physicists to study collective phenomena emerging from interactions between individuals in complex socio-technological systems. It provides examples of case studies on networks and cooperation, experiments on non-human primates to study hierarchy and cooperation, and efforts to classify human behaviors into phenotypes based on actions in social dilemmas.
Mpict cloud computing and ict workforce 20110106 v8 (ISSIP)
The document discusses emerging trends in information and communication technologies (ICT) and their implications. It notes that ICT is becoming pervasive and networked, with tremendous impact on society, the ICT workforce, and technical education. It argues that demand will increase for local ICT talent with broader skill sets that combine both depth and breadth of knowledge across disciplines and systems.
Cultivation of Crops using Machine Learning and Deep Learning (YogeshIJTSRD)
To assist with the entire farming operation, we use cutting-edge machine learning and deep learning technologies, helping farmers make informed decisions about their area's conditions, the factors that influence their crops, and how to protect them for a good yield. With the rise of big data technology and high-performance computing, machine learning has opened up new possibilities for data-intensive research in the multidisciplinary agri-technology domain: (a) plant disease forecasting, (b) fertilizer recommendation, and (c) crop recommendation. The papers presented have been filtered and classified to show how machine learning can support agriculture. By applying machine learning to sensor data, farm management systems are evolving into real-time, AI-powered programmes that provide rich suggestions and insights for farmer decision support and action. Ms. A. Benazir Begum | Ajith Manoj | Nithya E | Anamika S S | Sneshna "Cultivation of Crops using Machine Learning and Deep Learning" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-3, April 2021, URL: https://www.ijtsrd.com/papers/ijtsrd39891.pdf Paper URL: https://www.ijtsrd.com/engineering/computer-engineering/39891/cultivation-of-crops-using-machine-learning-and-deep-learning/ms-a-benazir-begum
Large datasets are not available for some diseases, such as brain tumors. This presentation and part 2 show how to find an actionable solution from a difficult cancer dataset.
The document summarizes Anita de Waard's presentation on Elsevier's experiments with big and small data. It discusses Elsevier's work with text mining and knowledge graphs to extract information from over 14 million articles. It also describes Elsevier's Medical Graph which predicts the probability of over 2,000 medical conditions occurring based on analysis of clinical data from 6 million patients. Finally, it reviews Elsevier's various tools and services to help researchers preserve, process, share, comprehend, access, and discover research data and publications.
Slides from ICWSM'17 workshop on Social Media for Demographic Research (Montreal, May 2017)
Overview of demography
How can demographers contribute to the analysis of big data (social media)? How can social media contribute to population studies?
Concerns over data quality.
Data Revolution and the SDGs: overview and value, huge challenges for attaining an economic-demographic-environment balance, and the urgent need for data scientists and demographers to work on these issues.
This document discusses challenges and opportunities for discovering and documenting biodiversity in the current information age. It argues that current taxonomic processes are too slow and that new approaches are needed to integrate distributed data sources and leverage community sourcing. Specifically, it advocates for:
1) Publishing new biodiversity data prior to formal documentation to accelerate discovery.
2) Developing automated workflows and online workspaces to integrate phylogenetic, distribution, and trait data.
3) Enabling community participation in annotating and improving global biodiversity models and maps.
4) Changing incentives to value data sharing over individual "kudos" and prioritize the collective good of the scientific community.
This document provides an overview of machine learning, data mining, and knowledge discovery. It discusses how technological advances have led to an explosion in the amount of data being generated. It then describes several common applications of data mining in business and science. Finally, it outlines some major data mining tasks like classification, clustering, and association rule mining.
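One of the tasks listed, association rule mining, rests on two simple measures that can be computed directly: the support of an itemset (how often it occurs) and the confidence of a rule (how often the consequent follows the antecedent). The transactions below are invented for illustration.

```python
# Toy market-basket transactions (invented); real mining would scan a
# purchase log and enumerate candidate itemsets, e.g. with Apriori.
transactions = [{"bread", "milk"}, {"bread", "butter"},
                {"bread", "milk", "butter"}, {"milk", "butter"},
                {"bread", "milk", "butter"}]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    """Confidence of the rule lhs -> rhs: support(lhs ∪ rhs) / support(lhs)."""
    return support(lhs | rhs) / support(lhs)

print(round(support({"bread", "milk"}), 2))        # 3 of 5 baskets -> 0.6
print(round(confidence({"bread"}, {"milk"}), 2))   # 3 of 4 bread baskets -> 0.75
```

Algorithms like Apriori only add an efficient way to enumerate itemsets whose support exceeds a threshold; the measures themselves are exactly these ratios.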
This document discusses a study that examines the genetic basis of mouse mandible shape using 3D phenotyping and landmarks. The study aims to validate and improve upon previous QTL mapping studies of mouse mandible shape by applying 3D micro-CT imaging, 3D landmarks, and geometric morphometrics. The study compares results using different landmark configurations, including 2D versus 3D landmarks and manual landmarks versus semilandmarks. It finds that using a large set of semilandmarks coupled with manual landmarks identifies significantly more QTLs and maps them more precisely, suggesting that finer phenotypic characterization with 3D landmarks yields better insights into mandibular genetic architecture. However, most variation is still embedded in the natural 2D plane of the mandible.
Exposome data challenge - ISGlobal hub prez July 2022.pptx (LeaMaitre1)
The document summarizes an exposome data challenge event organized by ISGlobal. The event aimed to promote open science and interdisciplinary collaboration around analyzing exposome data. Participants were given a simulated exposome dataset based on real data from the HELIX project and asked to apply their statistical methods to analyze the data. Twenty-five teams were selected to present their approaches at the event. The goal was to accelerate innovation in exposome research through this collaborative data analysis challenge.
Thematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. This test helps to evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results of the test can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills.
The document summarizes an exposome data challenge event organized by ISGlobal. The event aimed to promote open science and interdisciplinary collaboration around analyzing exposome data. Participants were given a simulated exposome dataset based on real data from the HELIX project and asked to apply their statistical methods to analyze the data. Twenty-five teams were selected to present their approaches at the event. The goal was to accelerate innovation in exposome research through this collaborative data analysis challenge.
Similar to Ciência de Dados: definição, desafios de modelagem e aplicações multidisciplinares (20)
hematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. This test helps to evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results of the test can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...AbdullaAlAsif1
The pygmy halfbeak Dermogenys colletei, is known for its viviparous nature, this presents an intriguing case of relatively low fecundity, raising questions about potential compensatory reproductive strategies employed by this species. Our study delves into the examination of fecundity and the Gonadosomatic Index (GSI) in the Pygmy Halfbeak, D. colletei (Meisner, 2001), an intriguing viviparous fish indigenous to Sarawak, Borneo. We hypothesize that the Pygmy halfbeak, D. colletei, may exhibit unique reproductive adaptations to offset its low fecundity, thus enhancing its survival and fitness. To address this, we conducted a comprehensive study utilizing 28 mature female specimens of D. colletei, carefully measuring fecundity and GSI to shed light on the reproductive adaptations of this species. Our findings reveal that D. colletei indeed exhibits low fecundity, with a mean of 16.76 ± 2.01, and a mean GSI of 12.83 ± 1.27, providing crucial insights into the reproductive mechanisms at play in this species. These results underscore the existence of unique reproductive strategies in D. colletei, enabling its adaptation and persistence in Borneo's diverse aquatic ecosystems, and call for further ecological research to elucidate these mechanisms. This study lends to a better understanding of viviparous fish in Borneo and contributes to the broader field of aquatic ecology, enhancing our knowledge of species adaptations to unique ecological challenges.
Or: Beyond linear.
Abstract: Equivariant neural networks are neural networks that incorporate symmetries. The nonlinear activation functions in these networks result in interesting nonlinear equivariant maps between simple representations, and motivate the key player of this talk: piecewise linear representation theory.
Disclaimer: No one is perfect, so please mind that there might be mistakes and typos.
dtubbenhauer@gmail.com
Corrected slides: dtubbenhauer.com/talks.html
Authoring a personal GPT for your research and practice: How we created the Q...Leonel Morgado
Thematic analysis in qualitative research is a time-consuming and systematic task, typically done using teams. Team members must ground their activities on common understandings of the major concepts underlying the thematic analysis, and define criteria for its development. However, conceptual misunderstandings, equivocations, and lack of adherence to criteria are challenges to the quality and speed of this process. Given the distributed and uncertain nature of this process, we wondered if the tasks in thematic analysis could be supported by readily available artificial intelligence chatbots. Our early efforts point to potential benefits: not just saving time in the coding process but better adherence to criteria and grounding, by increasing triangulation between humans and artificial intelligence. This tutorial will provide a description and demonstration of the process we followed, as two academic researchers, to develop a custom ChatGPT to assist with qualitative coding in the thematic data analysis process of immersive learning accounts in a survey of the academic literature: QUAL-E Immersive Learning Thematic Analysis Helper. In the hands-on time, participants will try out QUAL-E and develop their ideas for their own qualitative coding ChatGPT. Participants that have the paid ChatGPT Plus subscription can create a draft of their assistants. The organizers will provide course materials and slide deck that participants will be able to utilize to continue development of their custom GPT. The paid subscription to ChatGPT Plus is not required to participate in this workshop, just for trying out personal GPTs during it.
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...University of Maribor
Slides from talk:
Aleš Zamuda: Remote Sensing and Computational, Evolutionary, Supercomputing, and Intelligent Systems.
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Inter-Society Networking Panel GRSS/MTT-S/CIS Panel Session: Promoting Connection and Cooperation
https://www.etran.rs/2024/en/home-english/
The cost of acquiring information by natural selectionCarl Bergstrom
This is a short talk that I gave at the Banff International Research Station workshop on Modeling and Theory in Population Biology. The idea is to try to understand how the burden of natural selection relates to the amount of information that selection puts into the genome.
It's based on the first part of this research paper:
The cost of information acquisition by natural selection
Ryan Seamus McGee, Olivia Kosterlitz, Artem Kaznatcheev, Benjamin Kerr, Carl T. Bergstrom
bioRxiv 2022.07.02.498577; doi: https://doi.org/10.1101/2022.07.02.498577
ESPP presentation to EU Waste Water Network, 4th June 2024 “EU policies driving nutrient removal and recycling
and the revised UWWTD (Urban Waste Water Treatment Directive)”
ESR spectroscopy in liquid food and beverages.pptxPRIYANKA PATEL
With increasing population, people need to rely on packaged food stuffs. Packaging of food materials requires the preservation of food. There are various methods for the treatment of food to preserve them and irradiation treatment of food is one of them. It is the most common and the most harmless method for the food preservation as it does not alter the necessary micronutrients of food materials. Although irradiated food doesn’t cause any harm to the human health but still the quality assessment of food is required to provide consumers with necessary information about the food. ESR spectroscopy is the most sophisticated way to investigate the quality of the food and the free radicals induced during the processing of the food. ESR spin trapping technique is useful for the detection of highly unstable radicals in the food. The antioxidant capability of liquid food and beverages in mainly performed by spin trapping technique.
The debris of the ‘last major merger’ is dynamically youngSérgio Sacani
The Milky Way’s (MW) inner stellar halo contains an [Fe/H]-rich component with highly eccentric orbits, often referred to as the
‘last major merger.’ Hypotheses for the origin of this component include Gaia-Sausage/Enceladus (GSE), where the progenitor
collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the
MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space,
because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia
DR3 have positive caustic velocities, making them fundamentally different than the phase-mixed chevrons found in simulations
at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based
on a simple phase-mixing model, the observed number of caustics are consistent with a merger that occurred 1–2 Gyr ago.
We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative
measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data
1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’
did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within
the last few Gyr, consistent with the body of work surrounding the VRM.
2. Agenda
● Data Science – history and definitions
● Data deluge and the information economy
● The spectrum of structure and data models
– Tabular data
– Graphs
– Text
● Examples of multi-model, multi-disciplinary research
3. About the Speaker
● Professor of Databases/Data Science – UTFPR
● Research interests: Big Data, NLP, IR, complex networks, ML, privacy
● Computer Science graduate, "frustrated biologist"
5. A World of Data
● Likes on social networks
● Web pages
● Student grades
● Instagram photos
● Pokémon locations
● Television signals
● Checking account balances
● Products for sale
● Satellite images
● Medical exams
● Methane-level measurements in the atmosphere of Mars
● Telemetry from an F1 car
8. Data Science – History
● Science has always been Data Science
● Tycho Brahe (1546-1601) and Johannes Kepler (1571-1630) discovered the laws of planetary motion by collecting and analysing a large volume of observational data
● What has changed now:
– The amount of data being generated
– The new methods and technologies for analysis
– Our society's dependency on the generated knowledge
11. Computers
● Digital production and processing of data
● Database Management Systems (DBMSs)
● Data analysis limited to large corporations
12. Internet, Cellphones, Sensors, Data Storage...
● Fast and cheap communication for everyone
● Massive data production and consumption
● Commercial drive for new data management technologies
● Data-driven economy
● Data-driven science
14. Information Deluge
● 1 billion users connected to Facebook (23/08/2015)
● 2 billion smartphones in the world, 1 billion websites
● 300 hours of video uploaded to YouTube every minute
● Google, Amazon, Microsoft and Facebook = 1,200 petabytes = 1,200,000,000,000,000,000 bytes = 5 stacks of CDs reaching the International Space Station
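The CD-stack comparison can be sanity-checked with a back-of-the-envelope calculation. The per-CD figures below (700 MB capacity, 1.2 mm thickness) and the ISS altitude (~400 km) are assumed values for illustration, not numbers from the slides:

```python
# Rough check of the "5 CD stacks up to the ISS" claim.
total_bytes = 1_200 * 10**15      # 1,200 petabytes
cd_bytes = 700 * 10**6            # assumed ~700 MB per CD
cd_thickness_m = 1.2e-3           # assumed ~1.2 mm per CD
iss_altitude_m = 400e3            # ISS orbits at roughly 400 km

n_cds = total_bytes / cd_bytes
stack_height_m = n_cds * cd_thickness_m
print(f"stacks to the ISS: {stack_height_m / iss_altitude_m:.1f}")
```

Under these assumptions the stack works out to roughly five ISS altitudes, consistent with the slide's figure.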
17. Big Data
● Data sets that are so large or complex that traditional data processing applications are inadequate
● Challenges: analysis, capture, data curation, search, sharing, storage, transfer, visualization, querying, updating and information privacy
● Predictive analytics, user behavior analytics
18. Information Economy
● "The world's most valuable resource is no longer oil, but data"
● Data companies are the most valuable listed firms in the world
● The nature of data makes the antitrust remedies of the past less useful
(The Economist, May 6th 2017)
20. The Fourth Paradigm
● A thousand years ago: science was empirical, describing natural phenomena
● Last few hundred years: a theoretical branch, using models and generalizations
● Last few decades: a computational branch, simulating complex phenomena
● Today: data exploration (eScience), unifying theory, experiment, and simulation
(Jim Gray, 2007)
22. Data Science
Data Science is an interdisciplinary field about scientific methods, processes and systems to extract knowledge or insights from data in various forms [1].
Data science is a "concept to unify statistics, data analysis and their related methods" in order to "understand and analyze actual phenomena" with data [2].
[1] Dhar, V. (2013). "Data Science and Prediction"
[2] Hayashi, Chikio (1998). "What is Data Science?"
23. Data Science vs. Statistics
Dictionary definitions of statistical inference tend to equate it with the entire discipline. This has become less satisfactory in the "big data" era of immense computer-based processing algorithms. [...] Very broadly speaking, algorithms are what statisticians do while inference says why they do them. A particularly energetic brand of the statistical enterprise has flourished in the new century, data science, emphasizing algorithmic thinking rather than its inferential justification.
26. Tables
Red List of Threatened Species (International Union for Conservation of Nature)
27. On the Ecology of Human Carnivory
Zulmira Coimbra (advisor: Fernando Fernandez)
43. Text
Major threats to the species include cattle grazing, agriculture activities and mining activities throughout its range. A museum specimen collected from Reserva Forestal de Yotoco in 1996 tested positive for Batrachochytrium dendrobatidis (Velasquez et al. 2007). The presence of chytrid in this species in 1996 is consistent with the timing of the declines observed in the Yotoco subpopulations at the end of the 1990s, as well as the timing of other Bd declines in montane Andean species, suggesting it as a plausible, but unconfirmed cause. However, the species can still be found within Reserva Forestal de Yotoco (Velasquez et al. 2007).
44. NLP – Levels of Representation
Words
Morphology
Syntax
Explicit Semantics
Full Semantics
Higher levels of representation require the lower ones.
49. Research
● Analysis of the association between in-class social networks and academic performance
● Goal: understand how a student's circle of friends may influence their grades
50. Case 1: In-Class Social Networks and Academic Performance
Inputs: class social graph, grades spreadsheet
51. Results – Classes
● Average number of connections × average final grade: correlation 0.75 (p = 0.087)
● Average clustering × average assignment grade: correlation 0.64 (p = 0.167)
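Correlations with p-values like the ones on this slide can be computed with SciPy. The per-class averages below are toy values for illustration only, not the study's data:

```python
from scipy.stats import pearsonr

# Hypothetical per-class averages (toy data, not the study's):
# average number of connections and average final grade for six classes.
avg_connections = [3.2, 4.1, 5.0, 2.8, 4.6, 5.4]
avg_final_grade = [6.1, 6.8, 7.5, 5.9, 7.0, 7.9]

# pearsonr returns the correlation coefficient and a two-sided p-value.
r, p = pearsonr(avg_connections, avg_final_grade)
print(f"correlation = {r:.2f}, p-value = {p:.3f}")
```

With only a handful of classes, as in the slide, even a strong correlation can carry a p-value above 0.05, which is why both numbers are reported.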
56. Student Improvement
Students whose friends performed poorly on Exam 1 improved by only 0.5 points on average on Exam 2 (p < 0.01). Students whose friends performed well, in contrast, improved by 1.9 points on average, almost a 4-fold gain compared with the other group. This suggests that having friends with good academic performance has a direct impact on a student's grades.
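A group comparison of this kind is typically backed by a two-sample t-test. A minimal sketch with made-up improvement scores (not the study's data) follows:

```python
from scipy.stats import ttest_ind

# Toy Exam 2 minus Exam 1 improvements (hypothetical, for illustration):
# one group whose friends did poorly on Exam 1, one whose friends did well.
friends_poor = [0.2, 0.8, 0.4, 0.6, 0.5, 0.3, 0.7]   # mean = 0.5
friends_well = [1.5, 2.2, 1.8, 2.0, 1.7, 2.3, 1.8]   # mean = 1.9

# Independent two-sample t-test on the two groups' improvements.
t, p = ttest_ind(friends_well, friends_poor)
print(f"t = {t:.2f}, p = {p:.5f}")
```

A small p-value here indicates the difference in mean improvement between the two groups is unlikely to be due to chance.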
57. Results – Students
● Significant correlation between students' final grades and both eigenvector centrality (correlation: 0.48) and average neighbor degree (correlation: 0.40), suggesting the importance of the network topology
● Weak negative correlation between grades and betweenness centrality
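The network measures mentioned on this slide can be computed with networkx. The friendship graph and grades below are a toy example (hypothetical students), only meant to show the mechanics:

```python
import networkx as nx
from scipy.stats import pearsonr

# Toy in-class friendship network with hypothetical grades.
G = nx.Graph()
G.add_edges_from([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
                  ("D", "E"), ("E", "F"), ("B", "D")])
grades = {"A": 7.5, "B": 8.0, "C": 9.0, "D": 8.5, "E": 6.0, "F": 5.0}

# Eigenvector centrality: being connected to well-connected peers.
centrality = nx.eigenvector_centrality(G)
# Average degree of each student's neighbors.
avg_neigh_deg = nx.average_neighbor_degree(G)

students = sorted(G.nodes())
grade_list = [grades[s] for s in students]
r_c, _ = pearsonr([centrality[s] for s in students], grade_list)
r_d, _ = pearsonr([avg_neigh_deg[s] for s in students], grade_list)
print(f"grade vs eigenvector centrality: r = {r_c:.2f}")
print(f"grade vs average neighbor degree: r = {r_d:.2f}")
```

The study then interprets such correlations as evidence that where a student sits in the network topology relates to academic performance.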
59. Research
● Analysis of networks of entities cited in fake news
● Goal: understand how entities are mentioned and related in fake news, and how their prevalence correlates with political events
60. Text → Graph
"...Moro investigates Lula in the Lava-Jato operation..."
"...Lula meets with Dilma to discuss..."
Entities extracted from the sentences (Moro, Lula, Lava-Jato, Dilma) become nodes of the graph, and co-mentions become edges.
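A minimal sketch of this text-to-graph step follows. It assumes a fixed entity list for simplicity; a real pipeline would use named-entity recognition instead:

```python
import itertools
import networkx as nx

# Assumed entity list (a real system would run NER over the corpus).
ENTITIES = {"Moro", "Lula", "Lava-Jato", "Dilma"}
sentences = [
    "Moro investigates Lula in the Lava-Jato operation",
    "Lula meets with Dilma to discuss the campaign",
]

# Entities co-mentioned in the same sentence become connected nodes;
# edge weights count how often the pair is co-mentioned.
G = nx.Graph()
for sentence in sentences:
    found = [e for e in ENTITIES if e in sentence]
    for a, b in itertools.combinations(found, 2):
        w = G.get_edge_data(a, b, default={"weight": 0})["weight"]
        G.add_edge(a, b, weight=w + 1)

print(G.number_of_nodes(), "nodes,", G.number_of_edges(), "edges")
```

On these two sentences the first yields the triangle Moro, Lula, Lava-Jato and the second adds the edge Lula to Dilma.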
63. Topic Identification
● Graph clustering algorithm (modularity)
● Clusters represent frequently co-cited entities
● Clusters are used as topic representatives
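Modularity-based clustering of an entity graph can be sketched with networkx's greedy modularity algorithm. The edges below are toy co-citation data; the node names "Temer" and "Impeachment" are added here purely for illustration:

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Toy co-citation graph: two tightly knit entity groups joined by one edge.
G = nx.Graph([("Moro", "Lula"), ("Moro", "Lava-Jato"), ("Lula", "Lava-Jato"),
              ("Dilma", "Temer"), ("Dilma", "Impeachment"),
              ("Temer", "Impeachment"), ("Lula", "Dilma")])

# Greedy modularity maximization (Clauset-Newman-Moore) groups nodes
# that are more densely connected to each other than to the rest.
communities = list(greedy_modularity_communities(G))
for i, community in enumerate(communities):
    print(f"topic {i}: {sorted(community)}")
```

Each resulting community is a set of frequently co-cited entities, which the talk then treats as a topic representative.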
68. Work in Progress
● Understand the intentional use of metaphors in fake news
● Use the text in the IUCN table for automatic classification of threats
● Compare the evolution of fake-news topics with political events
● Study the formation of student groups
● Assess the impact of endangered species on the trophic network
69. The "Ciência de Dados por uma Causa" (Data Science for a Cause) Project
Page: http://dainf.ct.utfpr.edu.br/umacausa
Facebook: https://fb.me/cienciadadoscausa/