General statistics, emphasis of statistics with regards to healthcare, types of stats, methods of sampling, errors in sampling, different types of tests, measures of dispersion, correlation, types of correlation
“Statistics is the science of systematic collection, classification, tabulation, presentation, analysis
and interpretation of data.”
It is the science of facts and figures.
Define the terms: Statistics and Biostatistics
Discuss the importance of Biostatistics
Differentiate between Population & Sample, Parameter & Statistics
Identify the various sources of data collection
Explain the types of variables
Explore the different types of Measurement scales
Methods of Presenting the data
Tabular Presentation
Textual Presentation
Graphical Presentation
Statistics
Collection, Classification, Organization, Summarization, analysis, Presentation, and Interpretation of the data / information.
Biostatistics
Collection, Classification, Organization, Summarization, Analysis, Presentation, and Interpretation of data / information
related to the biological or health sciences is called “Biostatistics”.
Why do we need to study Biostatistics?
To learn how to deal with numbers.
To assess evidence from different studies.
To understand published scientific papers.
To do research and write papers in journals.
Population
The set of all the measurements of interest to the investigator.
Monthly income of households in Pakistan
Number of TB Patients in Pakistan
Sample
It is a group of subjects selected from a population.
A random sample is a good representation of the population.
Example
A survey of 1,000 households taken from all parts of Pakistan to assess their monthly income
Parameter
– A characteristic of interest to the researcher in the population is called a parameter.
E.g. average household size and percent of households with modern sanitation as reported in the 1998 census of Karachi
Statistic
– A characteristic of interest to the researcher in a sub-set of the population (a sample) is called a statistic.
E.g. average household size and percent of households with modern sanitation as reported from a sample survey of 6,000 households in Karachi, 2010
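The difference between a parameter and a statistic can be illustrated with a small simulation. This is only a sketch under stated assumptions: the household sizes are synthetic values generated for the example (they are not census data), and the names `population`, `sample`, `parameter_mean`, and `statistic_mean` are hypothetical.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: sizes of 100,000 households (synthetic, not census data).
population = [random.randint(1, 12) for _ in range(100_000)]

# Parameter: computed from every member of the population.
parameter_mean = statistics.mean(population)

# Statistic: computed from a simple random sample of 1,000 households.
sample = random.sample(population, k=1_000)
statistic_mean = statistics.mean(sample)

print(f"Parameter (population mean): {parameter_mean:.2f}")
print(f"Statistic (sample mean):     {statistic_mean:.2f}")
```

The two values are close but usually not identical; the statistic changes from sample to sample, while the parameter is fixed for a given population.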
Descriptive Statistics:
Consists of the collection, organization, summarization and presentation of data.
Inferential Statistics:
Consists of generalizing from samples to populations, performing estimations and hypothesis tests, and determining relationships among variables.
A Variable is simply what is being observed or measured
The dependent variable is the outcome of interest
The independent variable is the intervention or what is being manipulated
Data
The set of values collected for the variable from each of the elements belonging to the sample
Qualitative Variable:
Variables that can be placed into distinct categories, according to some characteristic or attribute.
Quantitative Variable:
Variables that are measured on a numeric
or quantitative scale. Interval and ratio scales are quantitative.
Nominal Scale
- It is the first level of measurement
- Named variables
Ordinal Scale
-Data measured at this level can be placed into categories, and these categories can be ordered, or ranked.
Interval scale:
Differences between values have meaning.
Ordered, with equal differences between adjacent values
Arbitrary zero (0 does not mean absence of the quantity)
Ratio scale:
Differences and ratios between values have meaning. Absolute zero (0 means absence of the quantity)
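The practical consequence of an arbitrary versus an absolute zero can be shown with temperature. The sketch below assumes two illustrative readings; the variable names and the helper `celsius_to_kelvin` are hypothetical and chosen only for this example.

```python
# Temperatures in degrees Celsius (interval scale: the zero point is arbitrary).
temp_a_c, temp_b_c = 10.0, 20.0

# Differences are meaningful on an interval scale.
difference_c = temp_b_c - temp_a_c  # 10 degrees warmer

# Ratios are not: 20 °C is not "twice as hot" as 10 °C.
naive_ratio = temp_b_c / temp_a_c  # 2.0, but this number has no physical meaning


def celsius_to_kelvin(celsius):
    """Convert to Kelvin, a ratio scale whose zero means absence of thermal energy."""
    return celsius + 273.15


# On the ratio scale the quotient is meaningful.
true_ratio = celsius_to_kelvin(temp_b_c) / celsius_to_kelvin(temp_a_c)

print(difference_c, naive_ratio, round(true_ratio, 3))
```

The difference of 10 degrees is valid on both scales, but only the Kelvin quotient describes how much warmer one reading really is.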
2. Statistics is a field of study that means different things to different
people. To some it indicates mathematics or arithmetic. To others it
has to do with figures or numbers that may not be easily
comprehensible. There are still a few who perceive statistics as an
abstract subject and are scared on hearing the name mentioned.
However, statistics is part of us and has application in our daily
activities. For instance, how many people reside in your household,
especially in a polygamous setting? How many are male and how
many are female? How many are educated beyond secondary
school level? In making all these observations you are, intentionally
or unintentionally, applying principles of statistics to some extent.
Introduction
3. The word “Statistic” is derived from the Latin word status,
meaning "manner of standing" or "position". Simply, it
means data or numbers.
Statistics is a branch of knowledge that deals with the
organization and summarization of data and drawing
inferences about large volumes of data after a part of it
is observed.
In its modern setting, statistics refers to the science of
collection, analysis, and interpretation of data.
4. Biostatistics:
The tools of statistics are employed in many fields:
business, education, psychology, agriculture, economics,
… etc.
When the data analyzed are derived from the biological
sciences, medicine, and health sciences, the analysis is referred
to as Biostatistics.
We use the term biostatistics to distinguish this
particular application of statistical tools and concepts.
5. Data
• The raw material of statistics is data.
• We may define data as figures (numbers). Figures
result from the process of counting or from taking a
measurement.
For example:
• When a hospital administrator counts the number of
patients (counting).
• When a nurse weighs a patient (measurement).
6. Sources of data
1. Routinely kept records: Hospital medical records contain
immense amounts of information on patients.
2. External sources: The data already exist in the form of
published reports, commercially available data banks, or
the research literature, i.e. someone else has already
asked the same question.
3. Surveys: A survey may be conducted to obtain
information that is not available from existing sources.
4. Experiments: Frequently the data needed to answer a
question are available only as the result of an experiment.
7. Variable
• This is derived from variation in living and non-living
things.
• A variable is any characteristic that can and does
assume different values from person to person in a
population or sample of interest.
• For example, demographic variables describe basic
characteristics of human populations, such as gender,
age, ethnicity, marital status, number of children,
education level, employment status, and income.
8. Quantitative Variables
It can be measured by a
numeric amount.
For example:
- the heights of adult males,
- the weights of preschool
children,
- the ages of patients seen in
a dental clinic.
Qualitative Variables
Many characteristics are not
capable of being measured
by a numeric amount.
Some of them can be
ordered or ranked.
For example:
- classification of people into
socio-economic groups,
- social classes based on
income, education, etc.
Types of variables
9. 1. Discrete Variables
These are variables that assume only whole numbers such as
0, 1, 2, 3, but not 2.6 or 3.415, e.g.:
- the number of patients in a hospital
- the number of students in a class
- the number of children in a family
2. Continuous Variables
These are variables that can assume values other than whole
numbers, e.g.:
- The height of an individual
- The weight of a motor car
- The age of an individual
- The weight of an individual, which can be 127 kg, 128.2 kg, etc.
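The distinction affects how the data are tabulated. The following sketch uses invented values; the lists, the 25 kg class width, and the helper `class_interval` are hypothetical choices made only for illustration.

```python
from collections import Counter

# Hypothetical data, invented for illustration.
children_per_family = [0, 2, 1, 3, 2, 2, 0, 1, 4, 2]        # discrete: whole numbers only
weights_kg = [61.5, 72.3, 127.0, 128.2, 58.9, 80.4, 95.1]   # continuous: any value in a range

# Discrete values can be tabulated directly, one row per distinct value.
discrete_table = Counter(children_per_family)


def class_interval(value, width=25):
    """Return the (lower, upper) class interval that contains the value."""
    lower = int(value // width) * width
    return (lower, lower + width)


# Continuous values are usually grouped into class intervals before counting.
continuous_table = Counter(class_interval(w) for w in weights_kg)

for value in sorted(discrete_table):
    print(f"{value} children: {discrete_table[value]} families")
for interval in sorted(continuous_table):
    print(f"{interval[0]}-{interval[1]} kg: {continuous_table[interval]} individuals")
```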
10. Types of variables
1. Independent Variable: -
Is the presumed cause, manipulated by the researcher to
observe the effect.
2. Dependent Variable: -
Is the response or outcome the researcher wishes to
explain or predict.
11. Measurement: Is the process of assigning numbers to variables.
The four levels of measurement are:
1. Nominal level of measurement: A nominal scale is the lowest
form of measurement because the numbers are simply used as
labels, representing categories or characteristics, and there is no
order to the categories (i.e., no category is higher or lower).
Examples of nominal variables are gender (male, female), religion
(Muslim, Christian), marital status (married, unmarried), and
region of residence (urban, rural).
12. 2. Ordinal level of measurement: The categories are
ordered according to a scale that shows their
relative magnitude, but the distance between categories is not specified. For
example:
• Anxiety levels of people in a therapy group might be
categorized as mild, moderate, and severe.
• Knowledge of students in a class might be
categorized as high, moderate, and low.
13. 3. Interval level of measurement: Data can be categorized and ranked, and the distance between the ranks can be specified. For
example, a reading of 37°C might be one category, 37.2°C might be
another category, and 37.4°C might constitute a third category.
There is a 0.2°C difference between the first and second categories and
between the second and third categories.
4. Ratio level of measurement: This is the highest form of measurement.
Data can be categorized and ranked; in addition, the distance
between ranks can be specified, and a “true” or natural zero point
can be identified. For example, for the number of pain medication
requests made by patients, it would be possible for some patients
to request no pain medications.
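The level of measurement determines which summaries are meaningful. The sketch below reuses the kinds of examples given above with invented observations; the variable names and the category ordering in `order` are assumptions made for illustration.

```python
import statistics

# Hypothetical observations, invented for illustration.
marital_status = ["married", "unmarried", "married", "married"]   # nominal
anxiety = ["mild", "severe", "moderate", "mild", "moderate"]      # ordinal
temperature_c = [37.0, 37.2, 37.4]                                # interval
pain_requests = [0, 2, 1, 0, 3]                                   # ratio

# Nominal: categories have no order, so only counting applies; report the mode.
nominal_summary = statistics.mode(marital_status)

# Ordinal: categories can be ranked, so the median category is meaningful.
order = ["mild", "moderate", "severe"]
ranks = sorted(order.index(level) for level in anxiety)
ordinal_summary = order[ranks[len(ranks) // 2]]

# Interval: distances between values are meaningful, so the mean can be used.
interval_summary = statistics.mean(temperature_c)

# Ratio: a true zero exists, so statements like "three times as many" also make sense.
ratio_summary = statistics.mean(pain_requests)

print(nominal_summary, ordinal_summary, round(interval_summary, 1), ratio_summary)
```

Each level supports the summaries of the levels below it, so a median could also be reported for the interval and ratio variables, but a mean of the nominal or ordinal categories would not be meaningful.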
15. Population
It is the largest collection of values of a random
variable for which we have an interest at a particular
time.
For example: The weights of all the children enrolled
in a certain elementary school.
Populations may be finite or infinite
The entities that make up a population may be
animals, machines, plants, cells, or
patients.
16. Populations and Parameters
• Population:
– A group of individuals that we would like to know something
about.
• Parameter:
– A characteristic of the population in which we have a
particular interest
• Often denoted with Greek letters (μ, σ, ρ)
• Examples:
– The proportion of the population that would respond to a certain
drug.
– The association between a risk factor and a disease in a
population.
17. Populations and Samples
• Studying entire populations is often too expensive and time-
consuming, and thus impractical.
• If a sample is representative of the population, then by
observing the sample we can learn something about the
population.
– thus by looking at the characteristics of the sample (statistics),
we may learn something about the characteristics of the
population (parameters).
19. Statistical Analyses
• Two steps
– Descriptive Statistics
• Describe the sample
– Inferential statistics
• Make inferences about the population using what is
observed in the sample.
• Primarily performed in two ways:
– Hypothesis testing
– Estimation
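The two steps can be sketched with Python's standard library. This is a minimal illustration under stated assumptions: the sample values and the hypothesised mean of 19.0 kg are invented, and a normal approximation is used for simplicity, although with a sample this small a t-based method would normally be preferred.

```python
import math
import statistics

# Hypothetical sample: weights (kg) of 12 children, invented for illustration.
sample = [18.2, 20.1, 19.5, 21.3, 17.8, 22.0, 19.9, 20.4, 18.7, 21.1, 19.0, 20.6]

# Step 1: descriptive statistics describe the sample itself.
n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)   # sample standard deviation (n - 1 denominator)
se = sd / math.sqrt(n)          # standard error of the mean

# Step 2a: estimation, an approximate 95% confidence interval for the population mean.
# Assumption: normal approximation; a t-based interval would be somewhat wider.
z = statistics.NormalDist().inv_cdf(0.975)
ci = (mean - z * se, mean + z * se)

# Step 2b: hypothesis testing, H0: population mean = 19.0 kg (hypothetical value).
z_stat = (mean - 19.0) / se
p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z_stat)))

print(f"mean = {mean:.2f}, sd = {sd:.2f}")
print(f"approx. 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"z = {z_stat:.2f}, two-sided p = {p_value:.3f}")
```

The descriptive step only summarises the 12 observed children; the interval and the test are statements about the population from which they are assumed to have been randomly sampled.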