R is a programming language and environment for statistical analysis and graphics. It has many built-in statistical and graphical techniques. R can be installed from CRAN and runs on Windows, MacOS, and UNIX systems. The basic R interface is the console, but RStudio provides an integrated development environment. In RStudio, you can write scripts, see outputs and plots, and access help and packages. Packages extend R's functionality through additional functions and data. Common data types in R include numeric, integer, character, factor, and logical. Vectors are the basic data structure, but R also supports matrices, arrays, data frames and lists.
Optimizing Set-Similarity Join and Search with Different Prefix Schemes HPCC Systems
As part of the 2018 HPCC Systems Summit Community Day event:
Up first, Zhe Yu of NC State University briefly discusses his poster, "How to Be Rich: A Study of Monsters and Mice of American Industry."
Following that, Fabian Fier presents his breakout session in the Documentation & Training Track.
Finding duplicate textual content is crucial for many applications, especially plagiarism detection. When dealing with millions of documents, finding duplicate content becomes very time-consuming. The task therefore needs scalable and efficient data structures and algorithms that solve it in seconds rather than hours. In my talk, I present an optimization of a common filter-and-verification set-similarity join and search approach. Filter-and-verification means that we only consider pairs of objects that share a common word or token in a prefix. Such pairs are potentially similar and are verified in a subsequent step. The candidate set is usually orders of magnitude smaller than the cross product over an input set. We optimized this approach by considering overlaps larger than 1, which reduces the candidate set further and makes the verification faster. On the other hand, this requires larger prefixes, which use more memory. Our experiments using HPCC Systems show that we can usually optimize the runtime by choosing an overlap different from the standard overlap of 1.
Fabian Fier is a PhD student in the database research group of Johann-Christoph Freytag. He holds a diploma in computer science from Humboldt-Universität. His research interest is similarity search on web-scale data. He uses techniques from textual similarity joins on Big Data and adapts them to similarity search.
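The filter-and-verification idea in the abstract above can be illustrated with a small sketch. This is not the speaker's HPCC Systems implementation; it is a minimal in-memory Python version that assumes Jaccard similarity, records given as token lists, and a global token order by ascending frequency. The names `similarity_join`, `min_overlap`, `threshold`, and `overlap` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def jaccard(a, b):
    """Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def min_overlap(size, threshold):
    """Lower bound on the tokens a set of this size shares with any partner
    whose Jaccard similarity is at least `threshold` (assumed measure)."""
    return max(1, math.ceil(threshold * size - 1e-9))


def similarity_join(records, threshold, overlap=1):
    """Self-join: return index pairs (i, j), i < j, with Jaccard >= threshold.

    `overlap` is the number of prefix tokens a candidate pair must share;
    1 is the standard prefix filter, larger values need longer prefixes.
    """
    sets = [set(r) for r in records]
    # Global order: rare tokens first, so prefixes hold selective tokens.
    freq = Counter(tok for s in sets for tok in s)
    ordered = [sorted(s, key=lambda tok: (freq[tok], tok)) for s in sets]

    index = defaultdict(list)   # token -> ids of records with it in their prefix
    shared = Counter()          # (i, j) -> number of common prefix tokens
    need = []                   # per-record lower bound on required overlap
    for rid, toks in enumerate(ordered):
        alpha = min_overlap(len(toks), threshold)
        need.append(alpha)
        prefix = toks[: len(toks) - alpha + overlap]
        for tok in prefix:
            for other in index[tok]:
                shared[(other, rid)] += 1
            index[tok].append(rid)

    result = []
    for (i, j), count in shared.items():
        # Filter: enough common prefix tokens (capped for very short sets).
        if count >= min(overlap, need[i], need[j]):
            # Verification: compute the exact similarity.
            if jaccard(sets[i], sets[j]) >= threshold:
                result.append((i, j))
    return sorted(result)
```

Calling `similarity_join(records, 0.6, overlap=2)` should return the same pairs as `overlap=1`; the difference is only in how many candidates reach the verification step and how many prefix tokens are indexed, which is the trade-off the abstract describes.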
This document provides an introduction to a course on data science and R programming. The course aims to provide an overview of data science and the data science process. It introduces R, including its history and how to install R and RStudio. The first module covers basic R programming concepts such as vectors, matrices, factors, and data frames.
(Kpi summer school 2015) theano tutorial part1 Serhii Havrylov
The document is a tutorial introduction to Theano, an open source Python library that allows users to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It introduces key concepts in Theano including symbolic variables, functions, shared variables and updates, gradients, substitution, and random streams. It provides information on where to access more documentation on Theano and sets up the tutorial environment for participants to complete example tasks to learn how to use Theano.
This document provides an introduction to R programming. It explains that R is an open source programming language for statistical analysis and graphics. It is used widely in data science because it is free, has a strong user community, and can implement advanced statistical methods. The document then covers downloading and installing R, the basic R environment including the command window and scripts, basic programming objects like vectors and data frames, and how to import and work with datasets in R. It emphasizes that R is powerful but can be difficult to learn because it is command-driven and has no commercial support.
The document discusses the Lisp programming language. It notes that Allegro Common Lisp will be used and lists textbooks for learning Lisp. It provides 10 points on Lisp, including that it is interactive, dynamic, uses symbols and lists as basic data types, prefix notation for operators, and classifies different data types. Evaluation follows simple rules and programs can be treated as both instructions and data.
R is an open source statistical programming language and software environment used widely for statistical analysis and graphics. This document provided an introduction to using R, including downloading and installing R, the basic R environment and interface, help resources, loading and using packages, reading data into R from files, and performing common descriptive statistics and linear regression modeling. Examples were provided using built-in and example datasets to demonstrate summarizing data, exploring variables, and fitting simple statistical models in R.
Workshop presentation hands on r programming Nimrita Koul
This document provides an overview of the R programming language. It discusses that R is an environment for statistical computing and graphics. It includes conditionals, loops, user defined functions, and input/output facilities. The document describes how to download and install R and RStudio. It also covers key R features such as objects, classes, vectors, matrices, lists, functions, packages, graphics, and input/output.
This document provides an overview of Theano, an open source Python library that allows symbolic computation of mathematical expressions and numerical optimization. It introduces key concepts in Theano like symbolic variables, functions, shared variables, and gradients. It also provides information on where to find more documentation on Theano and sets up a tutorial environment to demonstrate scalar math, vector math, matrix math, shared variables, and symbolic differentiation using Theano. The conclusion emphasizes that Theano combines ease of coding with fast execution and is used both in academia and industry, though the symbolic programming paradigm may not be suitable for all users.
The document provides an introduction to the Lisp programming language. It begins with an overview of Lisp and discusses its key features: it is a list processing language where lists are the basic data structure; it is functional in nature; and it uses interpretation rather than compilation. The document then covers Lisp basics like data types, evaluation rules, defining functions, conditional statements, loops, and input/output operations. It also introduces some common Lisp functions and techniques like car, cdr, cons, append, cond, do, dotimes, and dolist.
Lisp is a functional programming language where the basic data structure is linked lists and atoms. It was one of the earliest programming languages developed in 1958. Lisp programs are run by interacting with an interpreter like Clisp. Key aspects of Lisp include its use of prefix notation, treating all code as nested lists, defining functions using defun, and its emphasis on recursion and higher-order functions. Common control structures include cond for conditional evaluation and looping constructs like loop. Lisp fell out of widespread use due to performance issues with interpretation and low interoperability with other languages.
This document provides an overview of key concepts in MATLAB including:
- MATLAB can be used as a powerful calculator or programming language. It has many built-in functions and the ability to define variables and scripts.
- Scripts allow storing and running sequences of MATLAB commands. Variables can be created and manipulated using basic arithmetic, element-wise, and matrix operations.
- Common variable types include numeric arrays and cell arrays. Variables are initialized without declaring type or size. Built-in functions help work with variables.
- Key concepts covered include scripts, variables, vectors, matrices, basic operations, and plotting. Examples are provided to demonstrate MATLAB basics.
This document provides an introduction to the Lisp programming language. It discusses the history of Lisp, which was created in 1958. It also covers key Lisp concepts like S-expressions, atoms, function definition, evaluation, and macros. Macros allow programmers to generate Lisp code from Lisp code, extending the language. The document uses examples to demonstrate Lisp evaluation and features like conditional evaluation, higher-order functions, and special forms like 'quote and 'if.
Lisp was invented in 1958 by John McCarthy and was one of the earliest high-level programming languages. It has a distinctive prefix notation and uses s-expressions to represent code as nested lists. Lisp features include built-in support for lists, dynamic typing, and an interactive development environment. It was closely tied to early AI research and used in systems like SHRDLU. Lisp allows programs to treat code as data through homoiconicity and features like lambdas, conses, and list processing functions make it good for symbolic and functional programming.
CNIT 126 6: Recognizing C Code Constructs in Assembly Sam Bowne
Slides for a college course at City College San Francisco. Based on "Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software", by Michael Sikorski and Andrew Honig; ISBN-10: 1593272901.
Instructor: Sam Bowne
Class website: https://samsclass.info/126/126_S17.shtml
First in the series of slides for python programming, covering topics like programming language, python programming constructs, loops and control statements.
LISP, an acronym for list processing, is a programming language that was designed for easy manipulation of data strings. It is a commonly used language for artificial intelligence (AI) programming.
The document discusses stacks and queues as data structures. It begins by providing an introduction to stacks, describing them as linear data structures that follow the LIFO (last in, first out) principle. It then discusses various stack operations like push, pop, and peep using both array-based and linked implementations. The document also covers topics like multiple stacks, infix/postfix/prefix notation, and algorithms for converting infix to postfix notation and evaluating postfix expressions.
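As an illustration of the stack-based algorithms mentioned in the summary above, the following is a minimal Python sketch of infix-to-postfix conversion and postfix evaluation. It is not taken from the slides; it assumes space-separated tokens, the four binary operators `+ - * /`, left associativity, and parentheses. The function names `to_postfix` and `eval_postfix` are hypothetical.

```python
import operator

# Operator precedence and the function each operator applies (assumed set).
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
APPLY = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}


def to_postfix(tokens):
    """Convert a list of infix tokens to postfix using an operator stack."""
    output, stack = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            # Pop operators of higher or equal precedence (left-associative).
            while (stack and stack[-1] in PRECEDENCE
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]):
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            # Pop until the matching opening parenthesis, then discard it.
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()
        else:
            output.append(tok)  # operand
    while stack:
        output.append(stack.pop())
    return output


def eval_postfix(tokens):
    """Evaluate postfix tokens: push operands, pop two for each operator."""
    stack = []
    for tok in tokens:
        if tok in APPLY:
            right = stack.pop()   # last pushed is the right-hand operand
            left = stack.pop()
            stack.append(APPLY[tok](left, right))
        else:
            stack.append(float(tok))
    return stack.pop()
```

For example, `to_postfix("( 1 + 2 ) * 3".split())` yields `['1', '2', '+', '3', '*']`, which `eval_postfix` reduces to `9.0`. Both functions rely on the LIFO behaviour of a Python list used as a stack (`append` as push, `pop` as pop).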
This document provides an overview of the Lisp programming language. It discusses key features of Lisp including its invention in 1958, machine independence, dynamic updating, and wide data types. The document also covers Lisp syntax, data types, variables, constants, operators, decision making, arrays, loops, text editors, and common uses of Lisp like Emacs. Overall, the document serves as a high-level introduction to the concepts and capabilities of the Lisp programming language.
The Compatibility Challenge: Examining R and Developing TERR Lou Bajuk
Slides from Michael Sannella, architect for TIBCO Enterprise Runtime for R (TERR), on the Compatibility Challenge: Examining R and Developing TERR. Presented at useR 2014.
Slides for a college course at City College San Francisco. Based on "Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software", by Michael Sikorski and Andrew Honig; ISBN-10: 1593272901.
Instructor: Sam Bowne
Class website: https://samsclass.info/126/126_S17.shtml
The document discusses basic concepts related to exploit development such as vulnerabilities, exploits, fuzzers, memory management, assembly language, and stack-based overflows. It provides definitions and explanations of these key terms, how programs are laid out in memory, basic assembly instructions, register usage, and how to recognize common C language constructs when viewing assembly code.
Practical Malware Analysis: Ch 5: IDA Pro Sam Bowne
IDA Pro is a disassembler that supports interactive disassembly and analysis of executable files. It has both graph and text modes and uses color-coded arrows to indicate jump instructions. It contains useful windows like Functions, Names, Strings, Imports/Exports, and Cross-References to aid analysis. Functions can be analyzed by examining parameters, calls, and cross references. The disassembly can be enhanced through renaming locations, adding comments, and using named constants. IDA supports plugins for extended functionality.
Practical Malware Analysis: Ch 6: Recognizing C Code Constructs in Assembly Sam Bowne
This document discusses techniques for recognizing C constructs in assembly code, including function calls, variables, arithmetic operations, and branching. It explains that function arguments are pushed onto the stack in reverse order before a call instruction launches the function. Global variables are stored in memory and available to all functions, while local variables are stored on the stack and only available within their local function. Arithmetic operations move variables into registers, perform operations like addition and subtraction, and move results back to variables. Branching compares values and uses conditional jump instructions like jz and jnz to follow red or green arrows for false or true outcomes.
LISP Language, LISP Introduction, List Processing, LISP Syntax, Lisp Comparison Structures, Lisp Applications. Use of the LISP language in Artificial Intelligence.
Practical Malware Analysis: Ch 4 A Crash Course in x86 Disassembly Sam Bowne
This document provides an overview of six levels of abstraction in computing from hardware to interpreted languages. It then focuses on machine code and disassembly, explaining that disassembly converts binary malware into human-readable assembly language. Basic x86 architecture concepts are introduced, including CPU components, memory layout, instructions, registers, and basic operations like arithmetic, branching, and function calls.
CNIT 127 Ch 4: Introduction to format string bugs (rev. 2-9-17) Sam Bowne
Slides for a college course at City College San Francisco. Based on "The Shellcoder's Handbook: Discovering and Exploiting Security Holes", by Chris Anley, John Heasman, Felix Lindner, Gerardo Richarte; ASIN: B004P5O38Q.
Instructor: Sam Bowne
Class website: https://samsclass.info/127/127_S17.shtml
This document provides an introduction to the R programming language, including that it was developed at Bell Labs, it is a leading tool for statistics and data analysis, and it allows integration with other languages. It also briefly outlines R's installation process, functions, data types including numeric, integer, complex, logical and character, and matrices. Finally, it mentions R has built-in data frames and covers qualitative data.
Functional Python Webinar from October 22nd, 2014 Reuven Lerner
Slides from my free functional Python webinar, given on October 22nd, 2014. Discussion included functional programming as a perspective, passing functions as data, and writing programs that take functions as parameters. Includes (at the end) a coupon for my new ebook, Practice Makes Python.
Advanced Data Analytics with R Programming.ppt Anshika865276
R is a software environment for statistical analysis and graphics. It allows users to import, clean, analyze, and visualize data. Key features include importing data from various sources, conducting descriptive statistics and statistical modeling, and creating publication-quality graphs. R has a steep learning curve but is highly extensible and supports a wide range of statistical techniques through its base functionality and contributed packages.
The document provides an overview of the course curriculum for a Python with AI session. It covers Python basics, pandas for working with datasets, REST APIs and GitHub, data visualization, and a final project. It also reviews key Python concepts like conditionals, loops, lists, dictionaries, modules, and the pandas library for reading CSV files and working with dataframes. Exercises include generating random numbers and working with lists, dictionaries, and dataframes.
Key lecture for the EURO-BASIN Training Workshop on Introduction to Statistical Modelling for Habitat Model Development, 26-28 Oct, AZTI-Tecnalia, Pasaia, Spain (www.euro-basin.eu)
This document provides an overview and introduction to Python programming. It discusses Python basics like variables, data types, operators, conditionals, loops, functions and file handling. It also covers commonly used Python libraries and concepts in data analytics like NumPy, Pandas, Matplotlib and statistics. The document is intended as a whistle-stop tour to cover the most common aspects of Python.
This document provides a high-level summary of an introduction to Python programming course. The summary includes an overview of Python basics like variables, data types, operators, conditionals, loops, functions and file handling. It also discusses commonly used Python libraries and concepts in data analytics like NumPy, Pandas, Matplotlib and statistics.
The document discusses functional programming concepts and provides examples in Python. It defines functional programming, compares it to procedural and object-oriented paradigms, and outlines key concepts like pure functions, recursion, immutable data, and higher-order functions. It also provides examples of map, filter and reduce functions in Python and discusses advantages of the functional style.
This document introduces the R programming language. It covers obtaining and installing R, reading and exporting data, and performing basic statistical analyses and econometrics. R can be used for statistical analysis, modeling, and data visualization. It has a steep learning curve but is free, open source software with a strong user community and implements many advanced statistical methods.
This document is a report on Python for a class. It includes sections on the history of Python, why it is a good choice for learning programming, its core characteristics like being interpreted and object-oriented, common data structures like lists and dictionaries, the NumPy package for scientific computing, and a conclusion about the benefits of using Python as a teaching language.
This document is a report on Python for a class. It includes sections on the history of Python, why it is a good choice for learning programming, its core characteristics like being interpreted and object-oriented, common data structures like lists and dictionaries, the NumPy package for scientific computing, and a conclusion about the benefits of using Python as a teaching language.
R is a free software environment for statistical analysis and graphics. It allows importing, cleaning, analyzing, and visualizing data. Key features include its ability to handle many types of data, produce high-quality graphs, and implement a wide variety of statistical techniques like regression. R has a steep learning curve but a strong user community and implements advanced statistical methods. It can effectively store, manipulate, and summarize data.
Slides on introduction to R by ArinBasu MDSonaCharles2
R is a free software environment for statistical analysis and graphics. It allows importing, cleaning, analyzing, and visualizing data. Key features include its ability to read various data formats, perform statistical analyses and modeling, and produce publication-quality graphs. R has a steep learning curve but is highly extensible and supports a wide range of statistical techniques through its packages. This document provides an introduction to obtaining and installing R, performing basic tasks like importing data and help functions, and using R for descriptive statistics, statistical modeling, and multivariate analyses.
R is a free software environment for statistical analysis and graphics. It allows importing, cleaning, analyzing, and visualizing data. Key features include its ability to read various data formats, perform statistical analyses and modeling, and produce publication-quality graphs. R has a steep learning curve but is highly extensible and supports a wide range of statistical techniques through its packages. This document provides an introduction to obtaining and installing R, performing basic tasks like importing data and help functions, and using R for descriptive statistics, statistical modeling, and multivariate analyses.
Introduction to R for Learning Analytics ResearchersVitomir Kovanovic
The slides from my 2hr tutorial organised at 2018 Learning Analytics Summer Institute (LASI) at Teachers College, Columbia University on June 11, 2018.
A MAC URISA event. This talk is oriented to GIS users looking to learn more about the Python programming language. The Python language is incorporated into many GIS applications. Python also has a considerable installation base, with many freely available modules that help developers extend their software to do more.
The beginning third of the talk discusses the history and syntax of the language, along with why a GIS specialist would want to learn how to use the language. The middle of the talk discusses how Python is integrated with the ESRI ArcGIS Desktop suite. The final portion of the talk discusses two Python projects and how they can be used to extend your GIS capabilities and improve efficiency.
Recording of the talk: https://www.youtube.com/watch?v=F1_FqvbXHb4
This document provides an overview of the curriculum for a Python with AI course. The 8 sessions cover Python basics like conditionals, loops, operators and data structures. Sessions also focus on REST APIs, data visualization, connecting multiple AIs, and final projects. Key concepts taught include printing output, taking user input, for/while loops, writing to and appending files, lists, dictionaries, functions, and using external modules like NumPy and Pandas.
Best corporate-r-programming-training-in-mumbaiUnmesh Baile
Vibrant Technologies is headquarted in Mumbai,India.We are the best Teradata training provider in Navi Mumbai who provides Live Projects to students.We provide Corporate Training also.We are Best Teradata Database classes in Mumbai according to our students and corporates
- Python is an interpreted, object-oriented programming language that is beginner friendly and open source. It was created in the 1990s and named after Monty Python.
- Python is very suitable for natural language processing tasks due to its built-in string and list datatypes as well as libraries like NLTK. It also has strong numeric processing capabilities useful for machine learning.
- Python code is organized using functions, classes, modules, and packages to improve structure. It is interpreted at runtime rather than requiring a separate compilation step.
This document provides an overview of the basics of R. It discusses why R is useful, outlines its interface and workspace, describes how to get help and install packages, and explains some key concepts like objects, functions, and the search path. The document is intended to introduce new users to commonly used R functions and features to get started with the programming language.
This document provides an overview of the basics of R. It discusses why R is useful, outlines its interface and workspace, describes how to get help and install packages, and explains some key concepts like objects, functions, and the search path. The document is intended to introduce new users to commonly used R functions and features to get started with the programming language.
Similar to WiNGS 2014 Workshop 2 R, RStudio, and reproducible research with knitr (20)
WiNGS 2014 Workshop 2 R, RStudio, and reproducible research with knitr
1. Workshops in next-generation science at UNC Charlotte 2014
Workshop 2 - R, RStudio, & reproducible research with knitr
2. R, RStudio, & reproducible research with knitr
wings 2014
3. No programming experience necessary
"we wanted users to be able to begin in an interactive environment, where they did not consciously think of themselves as programming. Then as their needs became clearer and their sophistication increased, they should be able to slide gradually into programming..."
John Chambers, Stages in the Evolution of S
4. Why use R?
• Free & open source
• Has a lot of support
– Popular in many domains (finance, business analytics, statistics, biology)
• Many libraries available for biological data analysis through the Bioconductor project
– Such as edgeR (today)
• Now has an easy-to-use, free user interface called RStudio
5. RStudio
• A very nice graphical user interface for R.
• It's free!
• Integrates well with knitr
– tool for writing statistical reports w/ R Markdown
6. R Markdown ".Rmd"
• Lets you write a report that combines results and commands
• Sounds weird, but once you get used to it, it's very powerful
• Catch mistakes before publication
– Ask a friend to run & review your data analysis
7. knitr & R Markdown enable literate programming
• A way to do "literate programming"
– Developed by Donald Knuth, Stanford Computer Science professor
• Literate programming: Write programs that explain what they are doing while they are doing it.
• Practical application: Data Analysis Reports
8. Plan for Today
• Introduce R and RStudio
– Part 1: Functions & plots
– Part 2: Markdown
– Part 3: See how statistical testing works in R
• Differential expression analysis walk-through (may extend into Workshop 3)
• Goal: Get you started!
– Lots of Web resources for further study
10. Start RStudio
• RStudio has panes
– w/ min, max buttons (top right)
• Panes have tabs
Screenshot labels:
– console, where you type commands
– environment, shows variables you've defined
11. Make new project (Part 1)
• Select File > Project > New Project ...
• Choose New Directory
15.
• Open folder wings2014
• See wings2014.Rproj file
• Tip: After quitting, double-click it to start RStudio with the correct directory settings
16. Enter commands in Console
The > symbol is the prompt
• Type commands or expressions at the prompt, then press ENTER
• R evaluates what you type, prints the result
• Returns prompt
17. Practice: Try arithmetic expressions
• Add +
• Subtract -
• Multiply *
• Raise to a power ** (or ^)
• Expressions return values as one-element vectors.
• [1] indicates that the value next to it has this index.
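As a quick illustration of the operators listed above, here is a minimal sketch of what you might type at the prompt. The numbers and the variable name power are made up for this example.

```r
# Arithmetic expressions typed at the console prompt
3 + 4        # add
10 - 2.5     # subtract
6 * 7        # multiply
2 ** 10      # raise to a power; R treats ** the same as ^

# Even a single number is a vector with one element
power <- 2 ^ 10
length(power)
```

Each expression prints its value with [1] in front, the index of the first (and only) element.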
18. Practice: Save results to variables
• Use '=' to assign result to a variable
– Nothing printed
• Type variable name to see what's in it
• Use variables in expressions
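A small sketch of the three steps above. The variable names x and y are arbitrary choices for this example; many R users write the assignment as <- instead of =, and both work at the prompt.

```r
x = 3 + 4     # assignment: nothing is printed
x             # typing the name prints what is stored in it
y = x * 2     # a variable can be used inside another expression
y
```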
19. Variables refer to objects
• Environment tab shows objects created thus far
• Most of what you do in R involves manipulating objects saved to variable names
– Use objects as inputs to functions
20. R functions
• R has many functions
– math
– plotting
– statistical tests
• Functions take inputs called arguments
• Most functions have many possible arguments
– Usually have reasonable defaults
21. How to use a function in 4 steps
1. Type function name
2. Type "(" open paren
– RStudio types closing paren for you
3. Type arguments
– if more than one argument, insert "," (comma)
4. Type ENTER
Example: sqrt calculates square root
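The four steps can be sketched with sqrt, the function named on the slide. The second call uses round, chosen here only as an example of a function that takes two arguments separated by a comma.

```r
root <- sqrt(16)              # function name, open paren, one argument, close paren
root

rounded <- round(3.14159, 2)  # two arguments, separated by a comma
rounded
```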
22. Practice: rnorm function
• rnorm creates a vector of numbers randomly sampled from a normal distribution with specified mean, standard deviation
rnorm(10,5,5)
– function name: rnorm
– arguments: sample size (10), mean (5), standard deviation (5)
23. Practice: rnorm function
• Mean and standard deviation are optional
• If you don't specify them, they default to:
– 0 default mean
– 1 default sd
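A minimal sketch comparing the default and the explicit forms of rnorm. The call to set.seed is an addition for this example so that the "random" numbers are the same each time you run it; the variable names a and b are arbitrary.

```r
set.seed(1)              # assumption: fix the seed so the sample is repeatable

a <- rnorm(10)           # only the sample size: mean 0 and sd 1 are used
b <- rnorm(10, 5, 5)     # sample size 10, mean 5, standard deviation 5

length(a)                # both vectors hold 10 numbers
mean(a)                  # close to 0, but not exactly 0
mean(b)                  # close to 5, but not exactly 5
```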
24. R tip!
• Use UP arrow key to retrieve previous command
– Saves typing
25. Practice: R allows named arguments
Order can vary
rnorm(10,mean=5,sd=2)
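To see that the order of named arguments does not matter, the sketch below draws the same sample twice with the arguments shuffled. Resetting the seed before each call is an assumption added here so the two draws can be compared; n is the name rnorm gives its sample-size argument.

```r
set.seed(42)
first <- rnorm(10, mean = 5, sd = 2)

set.seed(42)
second <- rnorm(sd = 2, n = 10, mean = 5)   # same arguments, different order

identical(first, second)                    # TRUE: both calls mean the same thing
```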
26. help shows how to use a function
• Type help(rnorm) to list arguments, defaults
• help is a function
– takes other functions as arguments
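A short sketch of ways to look up a function's arguments. The interactive() guard is an addition for this example so the help page only opens when you are typing at the console; args and formals are base R functions not mentioned on the slide that show the same defaults without opening a page.

```r
if (interactive()) {
  help(rnorm)                 # opens the help page in the Help tab
}

args(rnorm)                   # prints the argument list with defaults

defaults <- formals(rnorm)    # the same information as an R object
defaults$mean                 # default mean
defaults$sd                   # default standard deviation
```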
27. Now you know how to...
• Calculate values & see the result
• Save output to variables
• Use Environment tab to view variables
• Use R functions
Next --- plotting!!!
28. R plotting functions
• Many options
– generic x-y plot, scatter plots
– barplots
– dendrograms
– histograms
... and much more
• Highly configurable!
– log or linear scale axes
– different characters or colors for points
... and much more
29. Practice: Generic x-y plot (scatter plot)
• named argument main determines plot title
• Note: Enclose text in quotes
30. Practice: Try other options
• col - color of points (in quotes)
• pch - point character
– numeric code
– letter (in quotes)
and many more...
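The plot options from the last two slides might be tried out like this. The data are random numbers made up for the example, and the titles, colors, and point codes are arbitrary choices.

```r
set.seed(1)                   # assumption: fixed seed so the picture is repeatable
x <- rnorm(50)
y <- rnorm(50)

plot(x, y, main = "Random points")                                # title goes in quotes
plot(x, y, main = "Red triangles", col = "red", pch = 17)         # numeric point code
plot(x, y, main = "Letters as points", col = "blue", pch = "x")   # letter in quotes
```

Each call to plot draws a fresh plot in the Plots tab, replacing the one before it.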
32. Practice: Adding to a plot (1)
• abline - "a b line"
– add straight line
• Arguments:
– v or h for location of vertical or horizontal line
– a and b for y intercept and slope
33. Practice: Adding to a plot (2)
• points
– add points to a plot
• Arguments:
– x, y: x & y values for the points
– other options, same as for plot
34. Take-home: In R you can "script" a plot
• Using plotting commands like points, abline, lines you can add more data to a plot, element by element
• Most plotting commands accept the same options, like
– pch - point character
– col - color
• Learning one plotting command helps you learn many.
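Building a plot element by element could look like the sketch below. The data are simulated for this example, and the colors and point codes are arbitrary; note that in abline, a is the y intercept and b is the slope.

```r
set.seed(2)                   # assumption: fixed seed so the picture is repeatable
x <- rnorm(50)
y <- 2 * x + rnorm(50)

plot(x, y, main = "Scripted plot", pch = 16, col = "gray40")
abline(h = 0, col = "blue")                            # horizontal line at y = 0
abline(v = 0, col = "blue")                            # vertical line at x = 0
abline(a = 0, b = 2, col = "red")                      # line with intercept 0, slope 2
points(mean(x), mean(y), pch = 4, col = "darkgreen")   # one extra point at the means
```

Every command after plot adds to the existing picture rather than starting a new one.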
37. How to install knitr
• Go to Packages tab
• Not checked?
– Check it
• Not installed?
– Select Tools > Install Packages...
– Enter knitr
– Click Install
• May need to restart RStudio
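The same check and install can be done from the console instead of the menus. This is a sketch, not the workshop's own method: the interactive() guard is added so nothing is downloaded when the code runs unattended, and the variable name have_knitr is arbitrary.

```r
# Is knitr already installed?
have_knitr <- requireNamespace("knitr", quietly = TRUE)

# Install it from CRAN only when missing and when typing at the console
if (!have_knitr && interactive()) {
  install.packages("knitr")
}

have_knitr
```

Checking the box in the Packages tab corresponds to running library(knitr).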
38. Setup - to enable better coding!
Go to Tools > Global Preferences > Panes
• Top right: console
• Lower right: Environment, History, Files, Plots, Help
• Top Left: Source
• Lower left: everything else
39. Practice: Make R Markdown file
• Click "new" file icon
• Choose R Markdown
– Creates an example R Markdown
• Take a moment to scan document
40. R Markdown has plain text with formatting instructions
• Row of "===" makes "Title" a top level heading
41. R Markdown has code chunks
• Code chunk - three back ticks, {r}, ends with three more back ticks
• gray background
42. knitr "knits" code & text
• Makes an HTML document (web page) that combines
– code
– output from code
– your text explanations
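To make the pieces concrete, the sketch below writes a tiny R Markdown file from R and, if knitr is installed, knits it. The file contents, the temporary folder, and the variable names are assumptions made for this example. Calling knitr::knit directly produces a Markdown file; the Knit HTML button in RStudio goes one step further and makes the web page.

```r
fence <- strrep("`", 3)       # three back ticks, built here to keep the example tidy

rmd <- c(
  "Title",
  "=====",                    # row of = signs makes "Title" a top level heading
  "",
  "Ten random numbers and their mean:",
  "",
  paste0(fence, "{r}"),       # start of a code chunk
  "x <- rnorm(10)",
  "mean(x)",
  fence                       # end of the code chunk
)

path <- file.path(tempdir(), "Example.Rmd")
writeLines(rmd, path)

# Knit only if knitr is available; the output mixes text, code, and results
if (requireNamespace("knitr", quietly = TRUE)) {
  out <- knitr::knit(path, output = file.path(tempdir(), "Example.md"), quiet = TRUE)
}
```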
43. Practice: Knit HTML
• Save the file as "Example.Rmd"
• Click the Knit HTML button
• Preview appears
• HTML file appears
• Click Example.html in File tab
– choose View in Web browser
44. knitr makes an HTML document (a Web page)
• Images embedded
• You can email it, save in a Dropbox, etc.
51. Statistical tests in R
• Tests implemented as functions
– Usually return list objects
• List is
– object that contains other objects of many types
• Previously, you saw vectors
– Output of rnorm command
– Vectors are like lists that only contain one type of object (e.g., numbers only)
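The difference between vectors and lists can be seen in a few lines. All values and names here are invented for the example.

```r
v <- c(1.5, 2, 3)             # a vector: numbers only
mixed <- c(1, "a")            # mixing types in a vector turns everything into text
class(mixed)                  # "character"

l <- list(numbers = v, label = "pollen", flag = TRUE)   # a list keeps each type
l$numbers                     # pull one item out of the list by name
class(l$flag)                 # "logical"
```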
52. Practice: Start a new section
• Heading, smaller than title heading
• Make new code chunk
• Make new vectors
• Run t.test
53. Tip: Markdown help
• "Using R Markdown" opens Web page w/ more info
• "Markdown Quick Reference" shows Markdown codes in Help tab
54. Practice: Run the code
• Cursor inside chunk
• Type CNTRL-ENTER
– or click run
• t.test output is in result
• result is a list
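A code chunk for this exercise might contain something like the sketch below. The two vectors are simulated with made-up means, and the names control and treatment are assumptions for the example; result is the variable name used on the slide.

```r
set.seed(3)                   # assumption: fixed seed so the numbers are repeatable
control   <- rnorm(10, mean = 5, sd = 1)
treatment <- rnorm(10, mean = 6, sd = 1)

result <- t.test(control, treatment)   # the test is just a function call

is.list(result)               # TRUE: the output is a list
names(result)                 # the items stored in the list
result$p.value                # pull out one item by name
```

Typing result on its own prints a formatted summary of the test.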
58. Goals
• Show you how to structure a data analysis
– Useful framework you can use in many settings
• Give you an example differential gene expression analysis for RNA-Seq
– Use it as a starting point for other projects
– Tip: Review edgeR user guide for other example data analyses
59. Structure of the data analysis
• Introduction
– explain the experimental design
– state questions (no more than 3, ideally 2)
• Analysis
– describe steps of analysis, with results
– explain judgment calls, like P value cutoffs
• Conclusion
– answer the original questions
• State limitations of the analysis
• Session info including software versions used
Adapted from Jeff Leek's Data Analysis, Coursera
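For the session info item, base R has a function that reports the software versions in use. A minimal sketch, typically placed in the last code chunk of a report; the variable name info is arbitrary.

```r
info <- sessionInfo()             # R version, operating system, loaded packages
info$R.version$version.string     # just the R version, as text
print(info)                       # the full summary to include in the report
```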
60. Practice: Setup
• Go to https://bitbucket.org/lorainelab/tomatopollen
62. Move to Desktop
• Subfolders correspond to analysis chunks
– See README.md for details
• Open DifferentialExpression
Note: Folder name suffix based on repo version
64. Review of the experiment
• Tomato plants subjected to chronic mild heat stress & control
– Greenhouse C
– Greenhouse B
• Mature pollen grains harvested in batches over eight weeks, ~10 plants per batch
– One treatment sample, one control sample per collection
• RNA extracted, sent to UCLA for sequencing
– 10 libraries, 5 treatments, 5 controls, 69 base paired-end sequencing
Next: Step-by-step walk-through of R Markdown