This document contains an R programming work sample that includes:
1) Performing partial and robust regression analysis, hypothesis testing, and residual diagnostics on various datasets.
2) Identifying outliers and influential points through plots and tests.
3) Testing hypotheses about regression coefficients and determining if assumptions are met.
4) Exploring transformations to stabilize variance and improve model fit.
Clips basics: how to make an expert system in CLIPS | adding facts | making rules - NaumanMalik30
In this CS607 tutorial I show how to build an expert system using CLIPS programming.
Facebook: https://web.facebook.com/Nauman1
Here is my SlideShare link for downloading the slides. For assignments, contact me via the Facebook link above. I hope you found this video useful.
This document discusses time series analysis techniques in R, including decomposition, forecasting, clustering, and classification. It provides examples of decomposing the AirPassengers dataset, forecasting with ARIMA models, hierarchical clustering on synthetic control chart data using Euclidean and DTW distances, and classifying the control chart data using decision trees with DWT features. Accuracy of over 88% was achieved on the classification task.
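The decomposition and forecasting steps summarized above can be sketched in a few lines of base R on the built-in AirPassengers dataset. The ARIMA order below is an illustrative choice, not necessarily the one used in the original document, and the clustering/classification steps (which need the dtw and party packages) are omitted.

```r
# Decompose the monthly AirPassengers series into trend, seasonal, and
# random components.
data(AirPassengers)
dec <- decompose(AirPassengers)

# Fit a seasonal ARIMA model; (1,1,1)(0,1,1)[12] is an illustrative order.
fit <- arima(AirPassengers, order = c(1, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))

# Forecast the next 12 months.
fc <- predict(fit, n.ahead = 12)
print(fc$pred)
```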
Panel Logit Data Processing in Stata: Goodness-of-Fit Assessment, Model Tests, ... - The1 Uploader
Panel data regression in Stata, with tests for violations of the classical assumptions, covering goodness of fit, pooled least squares, fixed effects, random effects, logit, and panel logit models.
Programs written in functional programming languages, like Scala and Clojure, are less complex than their Java counterparts. They are easier to reason about once you have climbed the initial learning curve of the language. Even though functional programming in Java has become syntactically sane with the introduction of lambdas, functional languages still offer competitive features such as tail call optimization, lazy evaluation, and persistent data structures. These features can be implemented as Java libraries. You will see how they can radically reduce the complexity of Java code, today.
This document provides an outline and overview of dynamic semantics and operational semantics. It discusses defining the meaning of programs through execution and transition systems. It introduces DynSem, a domain-specific language for specifying dynamic semantics in a modular way. DynSem specifications generate interpreters from language definitions. The document uses examples from arithmetic expressions and a language with boxes to illustrate DynSem specifications.
Programming Paradigms: Which One Is The Best? - Netguru
The document discusses different programming paradigms and which one may be best. It describes object-oriented programming, imperative programming, and declarative programming. For each, it provides examples in code to illustrate the paradigm. It argues that while imperative programming is popular and easy, it can be error-prone and not scale well. Declarative programming is described as simpler, safer, and more scalable by declaring intent rather than implementation. In the end, the document concludes that no single paradigm is best, and that they are often used together in practice.
The document discusses loops in C programming. It provides examples of using for loops to print stars, numbers from 1 to 10, and numbers from 1 to 1000. The for loop syntax of initialization, condition, and increment is explained. Additional examples are given to illustrate using arithmetic expressions and incrementing/decrementing the counter variable. The for loop provides an efficient way to repeat a block of code a fixed number of times without lengthy repetitive code.
This document provides a table of 100 functions with their derivatives worked out. It begins with introductory rules for deriving various types of functions like constants, polynomials, exponentials, logarithmic, trigonometric, and combined functions. Then it lists 100 specific functions and their derivatives to serve as worked examples applying the basic rules.
This document discusses binary search trees (BSTs) and their implementations. It covers:
1. Basic BST representations and operations like search, insertion, and iteration using an inorder traversal. Search and insertion have average case performance of O(log n).
2. Randomized BSTs which provide stronger guarantees of O(log n) height and search/insertion by randomly making inserted nodes the root with some probability.
3. Deletion in BSTs which is challenging and discusses approaches like using tombstones or replacing nodes with their successors to preserve the BST properties.
The document discusses time series analysis techniques in R, including decomposition, forecasting, clustering, and classification. It provides an overview of methods such as ARIMA modeling, dynamic time warping, discrete wavelet transforms, and decision trees. Examples are shown applying these techniques to air passenger data and synthetic control chart time series data, including decomposing, forecasting, hierarchical clustering with Euclidean and DTW distances, and classifying with decision trees using DWT features. Accuracy of over 80% is achieved on the classification tasks.
The document discusses SQL pattern matching using regular expressions. It provides an introduction to regular expression concepts and functions in Oracle for pattern matching like REGEXP_LIKE, REGEXP_SUBSTR, etc. It then describes how to go beyond the capabilities of these functions to retrieve related rows using SQL pattern matching with clauses like MATCH_RECOGNIZE, PATTERN, DEFINE, MEASURES and examples like identifying successive login failures and sessionization of clickstream data.
Introduced in Oracle Database 12c, the new MATCH_RECOGNIZE clause allows pattern matching across rows and is often associated with Big Data, complex event processing, and the like. Should SQL developers who are not (yet) faced with such tasks ignore it? No way! The new feature is powerful enough to simplify many day-to-day tasks and to solve them in a new, simple, and efficient way. The new syntax is introduced through common examples, such as finding gaps, merging temporal intervals, and grouping on fuzzy criteria. By providing a more straightforward approach to known problems, the new functionality deserves a place in every developer's toolbox.
The document defines a function called covcor() that calculates and returns the covariance and correlation between variables in a data frame. The function takes a data frame as input, splits it by a grouping variable, applies covariance and correlation calculations to subsets of the data, and combines the results into an output data frame. Three methods for defining the covcor() function are presented: 1) Using subset() and merge(), 2) Using tapply(), and 3) Using ddply() from the plyr package. The function is demonstrated on orange tree data to calculate covariance and correlation between tree age and circumference for each tree. Transforming the circumference variable affects the covariance but not the correlation, demonstrating properties of these statistical measures.
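A minimal sketch of a covcor()-style function along the lines described (the split/apply/combine variant) is shown below on the built-in Orange data; the exact signature and column names in the original document may differ. It also demonstrates the property mentioned above: rescaling circumference changes the covariance but not the correlation.

```r
# Compute covariance and correlation of two columns within each group.
# Signature and output columns are illustrative guesses.
covcor <- function(df, group, x, y) {
  pieces <- split(df, df[[group]])
  do.call(rbind, lapply(names(pieces), function(g) {
    sub <- pieces[[g]]
    data.frame(group       = g,
               covariance  = cov(sub[[x]], sub[[y]]),
               correlation = cor(sub[[x]], sub[[y]]))
  }))
}

res <- covcor(Orange, "Tree", "age", "circumference")

# Scaling circumference by 10 rescales the covariance but leaves the
# correlation unchanged.
Orange2 <- transform(Orange, circumference = circumference * 10)
res2 <- covcor(Orange2, "Tree", "age", "circumference")
```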
This document discusses term rewriting and provides examples of how rewrite rules can be used to transform terms. Key points include:
- Rewrite rules define pattern matching and substitution to transform terms from a left-hand side to a right-hand side.
- Examples show desugaring language constructs like if-then statements, constant folding arithmetic expressions, and mapping/zipping lists with strategies as parameters to rules.
- Terms can represent programming language syntax and semantics domains. Signatures define the structure of terms.
- Rewriting systems provide a declarative way to define program transformations and semantic definitions through rewrite rules and strategies.
The document discusses the benefits of declarative programming using Scala. It provides examples of implementing algorithms and data structures declaratively in Scala. It also discusses the history and future of Scala, as well as how Scala encourages thinking about programs as transformations rather than changes to memory.
Introduction to Lisp. A survey of lisp's history, current incarnations and advanced features such as list comprehensions, macros and domain-specific-language [DSL] support.
A short list of the most useful R commands
reference: http://www.personality-project.org/r/r.commands.html
Prepared for anyone who is interested in R or is just starting to learn it.
Michael Fogus discusses creating a minimal Lisp variant using only 7 core functions and forms. He demonstrates how to build up common Lisp features like lists, conditionals, functions, and recursion using only the primitive functions car, cdr, cons, cond, label, lambda, and dynamic scoping. Through a series of examples, he shows how more advanced features can emerge from these basics alone, culminating in a meta-circular evaluator. He argues the core set could be reduced even further to just 3 primitive forms.
Using R Tool for Probability and Statistics - nazlitemu
1. The document describes exercises from a probability and statistics lab report, including generating random vectors, estimating distributions, and assessing hypotheses.
2. For the first exercise, random vectors were generated from uniform, normal, and exponential distributions and their histograms, CDFs, and boxplots were represented. Bin sizes were also calculated.
3. Subsequent exercises involved comparing mean and variance, assessing dependence between random variables, modeling loss event data, and applying the central limit theorem.
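The first exercise described above can be sketched in base R as follows; the sample size and seed are illustrative, not taken from the original lab report.

```r
# Generate random vectors from three distributions.
set.seed(42)
n <- 1000
u <- runif(n)            # uniform on [0, 1]
z <- rnorm(n)            # standard normal
e <- rexp(n, rate = 1)   # exponential with rate 1

# Histogram, empirical CDF, and boxplot representations.
hist(z, breaks = "Sturges")
plot(ecdf(z))
boxplot(u, z, e, names = c("unif", "norm", "exp"))

# One common bin-count rule (Sturges): ceiling(log2(n)) + 1.
bins <- ceiling(log2(n)) + 1
```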
Data-Oriented Programming with Clojure and Jackdaw (Charles Reese, Funding Ci...) - confluent
When Funding Circle needed to scale its lending platform, we chose Kafka and Clojure. More than a programming language, Clojure is an interactive development environment with which you can build up an application function by function in a continuous unbroken flow. Since 2016 we have been developing our lending platform using Clojure and Kafka Streams, and today we process millions of transaction dollars daily. In 2018 we released "Jackdaw", our open-source Clojure library for working with Kafka Streams. In this talk, attendees will learn a radical new approach to building stream processing applications in a highly productive environment--one they can use immediately via Jackdaw or apply to their favorite programming system.
This document provides an overview of essential data wrangling tasks in R, including importing, exploring, indexing/subsetting, reshaping, merging, aggregating, and repeating/looping data. It discusses functions for reading different file types like CSV, Excel, and plain text. It also covers exploring data structure and summary statistics, subsetting vectors, data frames and matrices, reshaping between wide and long format, performing different types of joins to merge data, and using loops and sequences to repeat operations.
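Two of the wrangling tasks mentioned, joining and wide-to-long reshaping, can be sketched in base R as below; the data frames and column names are invented for illustration.

```r
# Toy data: quarterly sales and a region lookup table.
sales  <- data.frame(id = c(1, 2, 3), q1 = c(10, 20, 30), q2 = c(15, 25, 35))
region <- data.frame(id = c(1, 2, 4), region = c("east", "west", "north"))

# Inner join on id (merge keeps only matching rows by default).
joined <- merge(sales, region, by = "id")

# Wide -> long: one row per (id, quarter) pair.
long <- reshape(sales, direction = "long",
                varying = c("q1", "q2"), v.names = "amount",
                timevar = "quarter", times = c("q1", "q2"), idvar = "id")
```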
Using R in financial modeling provides an introduction to using R for financial applications. It discusses importing stock price data from various sources and visualizing it using basic graphs and technical indicators. It also covers topics like calculating returns, estimating distributions of returns, correlations, volatility modeling, and value at risk calculations. The document provides examples of commands and functions in R to perform these financial analytics tasks on sample stock price data.
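The return, volatility, and value-at-risk calculations mentioned above can be illustrated on simulated prices; real workflows would import actual stock data first, and the figures below are toy values.

```r
# Simulate a daily price path (toy data, not real stock prices).
set.seed(1)
prices <- 100 * cumprod(1 + rnorm(250, mean = 0.0004, sd = 0.01))

# Daily log returns.
ret <- diff(log(prices))

# Annualized volatility, assuming ~252 trading days per year.
vol <- sd(ret) * sqrt(252)

# Historical one-day 95% value at risk (reported as a positive loss).
var95 <- -unname(quantile(ret, 0.05))
```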
Exercises in Style in Programming - Software Guru
In the mid-20th century the French writer Raymond Queneau wrote a book called "Exercises in Style," in which he retold the same short story in 99 different ways.
In this talk we carry out the same exercise with a software program. We will cover a range of styles and paradigms: monolithic programming, object-oriented, relational, aspect-oriented, monads, map-reduce, and many others, through which we can appreciate the richness of human thought applied to computing.
This goes far beyond an academic exercise; the design of large-scale systems draws on this variety of styles. We will also discuss the dangers of remaining trapped in a narrow set of styles throughout your career, and the need to truly understand different styles when designing software system architectures.
About the speaker:
Crista Lopes is a professor of computer science at the University of California, Irvine. Her research focuses on software engineering practices for large-scale systems. Previously, she was a founding member of the Xerox PARC team that created the aspect-oriented programming (AOP) paradigm. Crista is one of the core developers of OpenSimulator, an open-source platform for building 3D virtual worlds. She is also the founder of Encitra, a company specializing in the use of virtual reality for sustainable urban development projects. @cristalopes
Functional Programming by Examples using Haskell - goncharenko
The document discusses functional programming concepts in Haskell compared to traditional imperative languages like C++. It provides:
1) An example of quicksort implemented in both C++ and Haskell to illustrate the differences in approach and syntax between the two paradigms. The Haskell version is much more concise, using only 5 lines compared to 14 lines in C++.
2) Explanations of key functional programming concepts in Haskell including pure functions, recursion, pattern matching, and higher-order functions like map and fold.
3) Examples and definitions of commonly used Haskell functions and data types to summarize lists, sorting, and traversing elements - highlighting the more declarative style of functional programming.
This document contains 30 programming exercises involving Scheme functions and data structures. The exercises cover topics like defining functions, manipulating lists, recursive functions, structures, and more. For each exercise, the reader is asked to write Scheme code to solve programming problems related to topics like reversing lists, finding elements in lists, defining recursive functions, and representing geometric and other relationships with nested data structures.
The document loads microbiome data and assigns diet types to subjects. It then analyzes the data at different taxonomic levels (phylum, class, etc.) and creates bar plots comparing the relative abundances between diet types. Stacked bar plots are generated showing the mean relative abundances of taxa for each diet type.
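The plotting step described above can be sketched schematically in base R; the taxa, diets, and abundance values below are invented for illustration, not taken from the original microbiome data.

```r
# Toy relative-abundance table: one row per (diet, taxon) observation.
abund <- data.frame(
  diet  = rep(c("herbivore", "carnivore"), each = 3),
  taxon = rep(c("Firmicutes", "Bacteroidetes", "Proteobacteria"), times = 2),
  rel   = c(0.5, 0.3, 0.2, 0.4, 0.4, 0.2)
)

# Mean relative abundance of each taxon within each diet type.
m <- tapply(abund$rel, list(abund$taxon, abund$diet), mean)

# Stacked bar plot: one bar per diet, stacked by taxon.
barplot(m, legend.text = rownames(m), col = rainbow(nrow(m)))
```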
This document provides a summary of MATLAB fundamentals including:
1. Basics such as defining and changing variables, arithmetic operations, elementary functions, complex numbers, constants, and numerics.
2. Graphics and plotting capabilities including different plot types.
3. Programming methods like functions, relational and logical operations, control structures like if/else statements and loops, and special topics like polynomials, interpolation, differential equations, and optimization.
4. Descriptive statistics, discrete math functions, and random number generation.
INFORMATIVE ESSAY - The purpose of the Informative Essay assignme.docx - carliotwaycave
INFORMATIVE ESSAY
The purpose of the Informative Essay assignment is to choose a job or task that you know how to do, and then write an Informative Essay of at least 2 and at most 3 full pages teaching the reader how to do that job or task. You will follow the organization techniques explained in Unit 6.
Here are the details:
1. Read the Lecture Notes in Unit 6. You may also find the information in Chapter 10.5 in our text on Process Analysis helpful. The lecture notes will really be the most important to read in writing this assignment. However, here is a link to that chapter that you may look at in addition to the lecture notes:
https://open.lib.umn.edu/writingforsuccess/chapter/10-5-process-analysis/ (Links to an external site.)
2. Choose your topic, that is, the job or task you want to teach. As the notes explain, this should be a job or task that you already know how to do, and it should be something you can do well. At this point, think about your audience (reader). Will your reader need any knowledge or experience to do this job or task, or will you write these instructions for a general reader where no experience is required to perform the job?
3. Plan your outline to organize this essay. Unit 6 notes offer advice on this organization process. Be sure to include an introductory paragraph that has the four main points presented in the lecture notes.
4. Write the essay. It will need to be at least 2 FULL pages long, maximum of 3 full pages long. You will use the MLA formatting that you used in previous essays from Units 3, 4, and 5.
5. Be sure to include a title for your essay.
6. After writing the essay, be sure to take time to read it several times for revision and editing. It would be helpful to have at least one other person proofread it as well before submitting the assignment.
Quiz2
# comments start with #
# to quit q()
# two steps to install any library
#install.packages("rattle")
#library(rattle)
setwd("D:/AJITH/CUMBERLANDS/Ph.D/SEMESTER 3/Data Science & Big Data Analy (ITS-836-51)/RStudio/Week2")
getwd()
x <- 3 # x is a vector of length 1
print(x)
v1 <- c(2,4,6,8,10)
print(v1)
print(v1[3])
v <- c(1:10) # creates a vector of the integers 1 through 10
print(v)
print(v[6])
# Import test data
test<-read.csv("CVEs.csv")
test1<-read.csv("CVEs.csv", sep=",")
test2<-read.table("CVEs.csv", sep=",")
write.csv(test2, file="out.csv")
# Write CSV in R
write.table(test1, file = "out1.csv",row.names=TRUE, na="",col.names=TRUE, sep=",")
head(test)
tail(test)
summary(test)
head <- head(test)
tail <- tail(test)
cor(test$X, test$index)
sd(test$index)
var(test$index)
plot(test$index)
hist(test$index)
str(test$index)
quit()
Quiz3
setwd("C:/Users/ialsmadi/Desktop/University_of_Cumberlands/Lectures/Week2/RScripts")
getwd()
# Import test data
data<-read.csv("yearly_sales.csv")
#A 5-number summary is a set of 5 descriptive statistics for summarizing a continuous univariate data set.
#It consists o ...
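The comment above is cut off; as a minimal illustration (on a made-up vector, not the sales data), base R's fivenum() and summary() both produce the statistics being described:

```r
# 5-number summary: minimum, lower hinge (~Q1), median, upper hinge (~Q3), maximum
x <- c(2, 4, 4, 5, 7, 9, 12)   # made-up data for illustration
fivenum(x)   # Tukey's five-number summary: 2 4 5 8 12
summary(x)   # the same five statistics plus the mean
```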
## Quinn McFee Work Sample (R programming)
## This is a sample of performing partial/robust regression, hypothesis testing, and
## other residual diagnostics/remedial measures on various datasets.
## Analysis is written out, but I suggest running the plots in R if you'd like to
## better visualize what I am talking about when referring to outliers, influential points, leverage, etc.
library(faraway)
prostate<-prostate
lcavol<-as.numeric(prostate[,'lcavol'])
lpsa<-as.numeric(prostate[,'lpsa'])
plot(lpsa,lcavol)
model<-lm(lpsa~lcavol)
summary(model)
## fitting regression line:
abline(model)
h<-1/97 +(lcavol-mean(lcavol))^2/sum((lcavol-mean(lcavol))^2)
lpsa.hat<-fitted(model)
SSE<-sum((lpsa-lpsa.hat)^2)
ExtStRes<-(lpsa-lpsa.hat)*((94/(SSE*(1-h)-(lpsa-lpsa.hat)^2)))^(1/2)
plot(lcavol,lpsa,col=ifelse(abs(ExtStRes)>=abs(qt(.975,df=94)),'red','black'),main='Outliers (red)')
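As a cross-check (not part of the original analysis), base R's rstudent() computes the same externally studentized residuals as the manual formula above:

```r
library(faraway)                      # provides the prostate data
m <- lm(lpsa ~ lcavol, data = prostate)
stud <- rstudent(m)                   # externally studentized residuals
# manual version of the same formula (n = 97, p = 2, so df = n - p - 1 = 94)
e <- resid(m); hv <- hatvalues(m); SSE <- sum(e^2)
manual <- e * sqrt(94 / (SSE * (1 - hv) - e^2))
all.equal(as.numeric(stud), as.numeric(manual))   # TRUE
```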
## H1: B1-B2=0 and B1-B3=0  (Scientist A)
## H2: B1-B2=0 and B2-B3=0  (Scientist B)
## Test: if 0 is not contained in either of the CIs given below
### we notice then that the only difference that could make
### Scientist A fail to reject the null hypothesis while Scientist B rejects
### would be when B2.hat and B3.hat are the furthest apart of the estimates
a<-.05
##let
n<-100
B.hat1<- 1
B.hat2<- 4
B.hat3<- -2
MSE<-1/qt(1-a/4,n-3) # for simplicity's sake
sB1<-2
sB2<-2
sB3<-2
# s^2(x-y) = s^2(x) + s^2(y)
ci1a<-c(B.hat1-B.hat2-qt(1-a/4,n-3)*MSE*sqrt(sB1^2+sB2^2), B.hat1-B.hat2+qt(1-a/4,n-3)*MSE*sqrt(sB1^2+sB2^2))
ci2a<-c(B.hat1-B.hat3-qt(1-a/4,n-3)*MSE*sqrt(sB1^2+sB3^2), B.hat1-B.hat3+qt(1-a/4,n-3)*MSE*sqrt(sB1^2+sB3^2))
ci1b<-c(B.hat1-B.hat2-qt(1-a/4,n-3)*MSE*sqrt(sB1^2+sB2^2), B.hat1-B.hat2+qt(1-a/4,n-3)*MSE*sqrt(sB1^2+sB2^2))
ci2b<-c(B.hat2-B.hat3-qt(1-a/4,n-3)*MSE*sqrt(sB2^2+sB3^2), B.hat2-B.hat3+qt(1-a/4,n-3)*MSE*sqrt(sB2^2+sB3^2))
## Return values, Scientist A:
# > ci1a
# [1] -9.0000000 -0.1715729
# > ci2a
# [1] -4.000000 5.828427
######### 0 is included in the second confidence interval (ci2a), so she cannot reject H1
## Return values, Scientist B:
# > ci1b
# [1] -9.0000000 -0.1715729
# > ci2b
# [1] -1.000000 5.828427
### The first confidence interval (ci1b) doesn't include 0, so he can reject H2
### Scientists A and B should each use a Bonferroni correction of two.
## A and B make 2 different comparisons in their hypotheses (B1=B2 U B1=B3, B1=B2 U B2=B3)
## while they should be testing the union of all of these (B1=B2 U B1=B3 U B2=B3)
## in order to be consistent.
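The Bonferroni logic can be made explicit with a small helper (hypothetical, not from the original code): with g simultaneous intervals at overall level a, each interval uses tail area a/(2g):

```r
# Bonferroni-adjusted CI for a difference of two coefficient estimates;
# se_diff = sqrt(s^2(Bi) + s^2(Bj)) for independent estimates
bonf_ci <- function(diff_hat, se_diff, g, df, a = .05) {
  crit <- qt(1 - a / (2 * g), df)   # alpha split across the g intervals
  c(lower = diff_hat - crit * se_diff,
    upper = diff_hat + crit * se_diff)
}
# all three pairwise comparisons would need g = 3, e.g. for B1 - B2:
bonf_ci(1 - 4, sqrt(2^2 + 2^2), g = 3, df = 97)
```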
## 3a)
## data table written to csv file and read in
deathcigs<-read.csv("Deathcigs.csv")
X<-deathcigs[,'X']
Y<-deathcigs[,'Y']
deathmodel<-lm(Y~X)
plot(X,Y)
abline(deathmodel)
## we see that Great Britain (465,1145) and the US (190,1280) are influential points,
## as they are far away from the regression line
## Deleting the United States:
XnoUS<-X[-c(4)]
SSE<-sum((Y-Y.hat)^2)
ExtStRes<-(Y-Y.hat)*((94/(SSE*(1-h)-(Y-Y.hat)^2)))^(1/2)
# using the median of X as the cutoff that partitions the data
cutoff<-as.numeric(summary(X)['Median'])
I1<-as.numeric(X[which(X<=cutoff)])
I2<-as.numeric(X[which(X>cutoff)])
Y1<-as.numeric(Y[which(X<=cutoff)])
Y1.hat<-fitted(lm(Y1~I1))
Y2<-as.numeric(Y[which(X>cutoff)])
Y2.hat<-fitted(lm(Y2~I2))
e1<-Y1-Y1.hat
e2<-Y2-Y2.hat
e1tild<-median(e1)
e2tild<-median(e2)
d1<-mean(abs(e1-e1tild))
d2<-mean(abs(e2-e2tild))
s1<-sd(abs(e1-e1tild))
s2<-sd(abs(e2-e2tild))
n1<-length(I1)
n2<-length(I2)
s<-sqrt(((n1-1)*s1^2+(n2-1)*s2^2)/(n1+n2-2))
tBF<-(d1-d2)/(s*sqrt(1/n1+1/n2)) ## = -4.95
threshold<-qt(.975,n1+n2-2) ## = 1.98
## since abs(tBF) > threshold,
## this suggests nonconstant variance by the BF test
## Trying a few transformations, I found the logarithmic transformation
## of the response and predictor to give a nicer linear regression (see plot)
var(log(X),log(Y)) ## and as a quantitative side check the spread is much smaller
## than in the original distribution (note: var() with two arguments returns their covariance)
plot(log(X),log(Y),main="Log of Distribution (nice!)")
abline(lm(log(Y)~log(X)))
## BF test on the log-transformed data
XL<-log(X)
YL<-log(Y)
cutoff<-as.numeric(summary(XL)['Median'])
I1<-as.numeric(XL[which(XL<=cutoff)])
I2<-as.numeric(XL[which(XL>cutoff)])
Y1<-as.numeric(YL[which(XL<=cutoff)])
Y1.hat<-fitted(lm(Y1~I1))
Y2<-as.numeric(YL[which(XL>cutoff)])
Y2.hat<-fitted(lm(Y2~I2))
e1<-Y1-Y1.hat
e2<-Y2-Y2.hat
e1tild<-median(e1)
e2tild<-median(e2)
d1<-mean(abs(e1-e1tild))
d2<-mean(abs(e2-e2tild))
s1<-sd(abs(e1-e1tild))
s2<-sd(abs(e2-e2tild))
n1<-length(I1)
n2<-length(I2)
s<-sqrt(((n1-1)*s1^2+(n2-1)*s2^2)/(n1+n2-2))
tBF<-(d1-d2)/(s*sqrt(1/n1+1/n2)) ## = .33857
threshold<-qt(.975,n1+n2-2) ## = 1.98
## the test statistic is smaller than the threshold, suggesting
## constant variance
## sockeye spawner/recruit data (assumed loaded as a data frame)
Xs<-as.numeric(sockeye[,'spawners'])
Yr<-as.numeric(sockeye[,'recruits'])
plot(Xs,Yr,main='Spawners vs Recruits Regression')
sockmodel<-lm(Yr~Xs)
abline(sockmodel)
Yr.hat<-fitted(sockmodel)
e<-as.numeric(Yr-Yr.hat)
## plotting residuals vs predictor:
plot(Xs,e)
abline(lm(e~Xs))
## Shortcut BF/Levene test for constancy of variance
library('lawstat')
levene.test(e,group=Xs>=534.5)
## Test statistic = 3.1035, p-value = .08988. So we don't satisfy the Gauss-Markov condition that
## requires constant variance, so a least squares simple regression will not represent the data in a
## meaningful way
plot(Xs,Yr)
curve(3*x*exp(-0.001*x), add=TRUE)  # curve() needs an expression in x; Ricker shape with B0~3, B1~.001
## taking the log of both sides of the Ricker model Y = B0*X*exp(-B1*X):
## log(Y) = log(B0) + log(X) - B1*X
## log(Y/X) = log(B0) - B1*X
## letting our new Y = log(Y/X) we have a new linear regression fit,
## and solving the system we find
## B0 ~ 3 and B1 ~ .001
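A sketch (on simulated spawner/recruit data, since the sockeye values aren't reproduced here) of recovering B0 and B1 from the linearized fit log(Y/X) = log(B0) - B1*X:

```r
set.seed(1)
B0 <- 3; B1 <- 0.001                     # true values, chosen to match above
X <- runif(50, 100, 1500)                # simulated spawners
Y <- B0 * X * exp(-B1 * X) * exp(rnorm(50, sd = 0.1))   # Ricker plus noise
fit <- lm(log(Y / X) ~ X)                # linearized Ricker regression
B0.hat <- exp(coef(fit)[[1]])            # intercept estimates log(B0)
B1.hat <- -coef(fit)[[2]]                # slope estimates -B1
```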
sat<-sat
Y<-sat$total
X1<-sat$expend
X2<-sat$salary
X3<-sat$ratio
model1<-lm(Y~X1+X2+X3)
# calling model1:
## Coefficients:
## (Intercept)    X1      X2      X3
##   1069.234  16.469  -8.823   6.330
## This model says that increasing expenditure and/or the pupil-teacher ratio
## raises SAT scores, while increasing teacher salaries lowers them.
## Note that it doesn't make much sense that more pupils per teacher and
## lower salaries for teachers would increase the average total SAT score,
## so we must have left out an important predictor variable
X4<-sat$takers
model2<-lm(Y~X1+X2+X3+X4)
###Coefficients:
###(Intercept) X1 X2 X3 X4
### 1045.972 4.463 1.638 -3.624 -2.904
# Now we're talking. This model makes more sense: now when we increase teachers per pupil
# and the teacher's salary, the SAT scores go up (in the model).
## H0: B2=0  Ha: B2 != 0
## level set alpha, a=.05; using T = B.hat2/s{B.hat2}
B.hat0<-1069.234
B.hat1<-4.463
B.hat2<-1.638
B.hat3<--3.624
B.hat4<--2.904
K=4
Y.hat<-as.numeric(fitted(model2))
n <-length(X3)
MSE<- 1/(n-2)*sum((Y-Y.hat)^2) ## 1002.6 (note: with 4 predictors the error df is n-5, not n-2)
sB.hat2<-sqrt(MSE/sum((X2-mean(X2))^2)) ## .761 (rough; the exact s.e. comes from vcov(model2))
T<-B.hat2/sB.hat2 ## 2.15
t<-abs(qt(.975,n-2)) ## 2.01
## Because T > t we reject H0
sB.hat3<-sqrt(MSE/sum((X3-mean(X3))^2))
T<-B.hat3/sB.hat3
t<-abs(qt(.975,n-2))
## H0: B1=B2=B3=0  Ha: at least one of B1, B2, B3 != 0
## level set alpha, a=.05
## B2=0 was rejected, so we do not need to check B3 and B1:
## H0 requires B1=B2=B3=0, and B2=0 was rejected, so
## the joint hypothesis is rejected
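The same joint hypothesis can be tested directly with a nested-model F test via anova(); a sketch, assuming the same faraway sat data:

```r
library(faraway)                              # provides the sat data
full    <- lm(total ~ expend + salary + ratio + takers, data = sat)
reduced <- lm(total ~ takers, data = sat)     # drop X1, X2, X3
anova(reduced, full)    # F test of H0: B1 = B2 = B3 = 0 given takers
```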
res<-Y-Y.hat
plot(model2,which=c(1:6))
# from the residuals vs fitted plot we observe the fitted curve to be non-linear, suggesting nonconstant
# variance; the variance is nonconstant so the Gauss-Markov condition is not satisfied, meaning the error
# is not normal
# from the residuals vs fitted plot we see that 29, 24, and 48 are potential outliers
# from the Cook's distance plot we see that 44 has the greatest influence on the regression,
# so 44 is most likely an influential point. 48 also has relatively high Cook's distance;
# however, that data point is not clearly an outlier in the residuals vs leverage plot
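The influence diagnostics read off the plots can also be pulled out numerically; a sketch refitting the same model on the faraway sat data:

```r
library(faraway)
m2 <- lm(total ~ expend + salary + ratio + takers, data = sat)  # same fit as model2
cd  <- cooks.distance(m2)    # Cook's distance per observation
lev <- hatvalues(m2)         # leverage values
which.max(cd)                # most influential observation (44 above)
which(cd > 4 / nrow(sat))    # common rule-of-thumb cutoff 4/n
```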