This document describes a versatile table/listing macro called %ANYTL. %ANYTL can produce typical tables and listings using only 3 macro parameters that correspond to the row, column, and title/footnote definitions. It uses a simple language similar to PROC TABULATE to define tables and listings. %ANYTL is flexible in that it allows modifying the summary results before outputting to a table. Examples are provided demonstrating how %ANYTL can quickly produce common table types such as demographic tables, hierarchical tables, and adverse event tables with only a few lines of code.
Paper AD09-2009
%ANYTL: A Versatile Table/Listing Macro
Yang Chen, Forest Research Institute, Jersey City, NJ
ABSTRACT
Unlike traditional table macros, %ANYTL has only 3 macro parameters (which correspond to the row, column, and title/footnote definitions of a table, respectively) and can produce most typical tables/listings. It uses a simple language and grammar, similar to PROC TABULATE, which defines a table/listing with variable names and a few special characters.

%ANYTL is very flexible, allowing one to manipulate the summary result prior to outputting it to a table. The big N, column width, pagination and wrapping of the table can either be automatically calculated or specified. This macro has proven to be powerful while remaining succinct and easy to use.
1. INTRODUCTION
In the pharmaceutical industry, numerous macros have been developed to simplify the process of table/listing production. Such macros typically have many parameters for specifying the content and layout of the table, since there are many factors to consider. In most cases, a series of macros is needed to handle different tables and data. This dual complexity makes it very difficult for programmers to use most of the macros that are currently available: one has to frequently refer to manuals to ensure parameter names are typed in correctly.

In contrast to such traditional macros, %ANYTL uses only three parameters to describe the three dimensions of a table, thus eliminating the need to remember any parameter names. Instead, it dissects the parameters to get the descriptive information. Its simple grammar uses special characters such as ‘*’, ‘.’ and ‘( )’, similar to that used in PROC TABULATE in SAS®. In order to understand the logic of %ANYTL, let’s revisit the structure and content of a table/listing and the analysis data type.
1.1 THREE DIMENSIONS OF A TABLE/LISTING
No matter how complex a table/listing is, it can always be defined by 3 dimensions (row, column and table number). In the example of Table 1, the row definitions of the demographic table are Age and Sex; the column definition is Treatment; and the table number definition is Table 1, with its associated titles and footnotes.
Table 1
Demographic Characteristics
Safety Population
Demographic Treatment A Treatment B Total
Parameter (N=XXX) (N=XXX) (N=XXX)
Age (years)
Mean xx.x xx.x xx.x
SD x.xx x.xx x.xx
Median xx.x xx.x xx.x
Min, Max xx, xx xx, xx xx, xx
n xxx xxx xxx
Sex, n (%)
Male xx (xx.x) xx (xx.x) xx (xx.x)
Female xx (xx.x) xx (xx.x) xx (xx.x)
Text 1
Text 2 with ’ sign
Based on this important fact, the %ANYTL macro is designed to have 3 corresponding parameters:
%ANYTL(ROW_DEF, COL_DEF, TAB_DEF);
ROW_DEF provides the source dataset names, analysis variable names and types, class variable names if any, and options such as display text, selection range, statistics, etc.
COL_DEF provides the class variable names corresponding to the column/subcolumns of the statistical output.
Usually it will be the treatment variable, with or without subgroup class variable(s).
TAB_DEF provides the output table number so that %ANYTL can find the corresponding titles and footnotes in the
prepared title/footnote dataset and output the summary result to a table file.
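To make this concrete, a title/footnote dataset could be prepared with a short DATA step. The sketch below is only an illustration: the paper does not specify the structure %ANYTL expects, so the variable names (TABNO, TYPE, TEXT) and the one-record-per-line layout are assumptions.

   /* Hypothetical title/footnote dataset: one record per title (T)
      or footnote (F) line, keyed by table number. */
   data tfl_titles;
      infile datalines dsd dlm='|' truncover;
      length tabno $20 type $1 text $200;
      input tabno $ type $ text $;
      datalines;
   Table 1|T|Demographic Characteristics
   Table 1|T|Safety Population
   Table 1|F|Text 1
   ;
   run;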
1.2 THREE TYPES OF DATA
Another fact is that a table/listing contains 3 types of data: continuous data, categorical data and text. A few special
characters are used in %ANYTL to distinguish them.
• CONTINUOUS DATA
By default, a variable specified in ROW_DEF of %ANYTL is considered continuous data. For example, to get the continuous output for the Age section in Table 1, call the following:
%ANYTL(MYLIB.PROF.AGE, ..., ...)
This will command %ANYTL to calculate N, SD, Mean, Min, Max and Median of the continuous variable AGE from the PROF dataset in the MYLIB library (the library name is optional). ‘.’ is the separator between the dataset name and the variable name.
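For reference, the same six statistics could be obtained from a plain PROC MEANS step. This is a minimal sketch of the equivalent computation, not the macro's actual internals (which the paper does not show):

   /* Equivalent summary for the Age section of Table 1 */
   proc means data=mylib.prof n mean std median min max;
      var age;
   run;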
• CATEGORICAL DATA
Categorical data is always summarized as counts with or without the corresponding percentages for each group. ‘/’
is used in %ANYTL to separate the numerator and denominator variable names, and thus indicate categorical data.
To produce the Sex section in Table 1, call the following:
%ANYTL(PROF.SEX/PROF.SAFETY(=1), ..., ...)
The above will calculate the count and percentage of records in each category of the categorical variable SEX from the PROF dataset, divided by the number of records where SAFETY=1 in the PROF dataset. Note that the denominator variable can come from a different source dataset. Here, since the denominator variable comes from the same dataset as the numerator variable, we can simplify things further by doing the following:
%ANYTL(PROF.SEX/SAFETY(=1), ..., ...)
The display of the categorical summary is conveniently controlled by the number of ‘/’ used.
VAR1        -> X
VAR1/VAR2   -> X (%)
VAR1//VAR2  -> X/XXX
VAR1///VAR2 -> X/XXX (%)
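To show the arithmetic behind these displays, here is a hedged PROC SQL sketch of the VAR1/VAR2 (count and percentage) case; it illustrates the numerator/denominator logic rather than %ANYTL's actual code:

   /* Count of each SEX category, with percentage based on the
      safety population (SAFETY=1), i.e. the X (%) display */
   proc sql;
      select sex,
             count(*) as n,
             100 * count(*) /
               (select count(*) from prof where safety=1)
               as pct format=5.1
      from prof
      group by sex;
   quit;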
By default, the count will be the number of records in each category. In some cases, unique patients instead of records should be counted. In that situation, you can simply add ‘#’ at the end of the statement.
%ANYTL(PROF.SEX/SAFETY(=1)#, ..., ...)
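The effect of ‘#’ can be sketched in PROC SQL as a distinct count, assuming a patient identifier variable such as SUBJID (hypothetical; the paper does not name the ID variable):

   /* Unique patients, rather than records, per SEX category;
      SUBJID is an assumed patient ID variable */
   proc sql;
      select sex, count(distinct subjid) as n_patients
      from prof
      group by sex;
   quit;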
By default, %ANYTL will count the number of records for all of the categories of a variable, as determined by its associated format, so even a category whose count is 0 will be displayed. You can specify the range of categories to be displayed simply by using the '=' sign in parentheses. For example,

%ANYTL(PROF.SEX(=1=2)/SAFETY(=1)#, ..., ...)
This means that male and female (WHERE SEX in (1, 2)) will be displayed.
• TEXT
Sometimes, text that is not based on the data is also displayed, either as a header for a summary or simply as an
individual line in the table. To display this, simply use single or double quotation marks.
For example,
%ANYTL(PROF.AGE'Age (years)', ..., ...) adds the title ‘Age (years)’ to the summary of AGE.
%ANYTL('Text 1' "Text 2 with ' sign", ..., ...) will produce the last two lines in Table 1.
2. APPLICATION OF %ANYTL FOR TABLES
2.1. QUICK START: THREE BASIC TABLES
This section demonstrates how %ANYTL can quickly and easily produce three tables that are commonly used in the
pharmaceutical industry.
• DEMOGRAPHIC TABLE
Table 1 is a typical demographic table, which contains a mixture of continuous and categorical data without class
variables. This can be produced with the following macro call:
%ANYTL(PROF.AGE'Age (years)'
PROF.SEX'Sex, n(%)'/PROF.SAFETY(=1)
'Text 1'
"Text 2 with ' sign",
TREAT+TOTAL,
Table 1 'Demographic Parameter');
The 1st parameter (ROW_DEF) commands %ANYTL to calculate the statistics of the continuous variable AGE from
the PROF dataset and then count each of the categories of the categorical variable SEX (from PROF dataset) as
well as calculate its percentages based on the safety population (SAFETY=1 in PROF dataset). In addition, the
macro adds two lines of text. Each individual item is separated by space(s).
Note that each analysis variable can come from a different source dataset. In this example, since they all come from
the same dataset PROF, the dataset name can be omitted after its first appearance, so the macro call can be
simplified to:
%ANYTL(PROF.AGE'Age (years)'
SEX'Sex, n(%)'/SAFETY(=1)
'Text 1'
"Text 2 with ' sign",
TREAT+TOTAL,
Table 1 'Demographic Parameter');
The 2nd parameter (COL_DEF) commands %ANYTL to calculate the summaries by variable TREAT. ‘+TOTAL’ is a
reserved term and produces an additional total column.
The 3rd parameter (TAB_DEF) outputs the statistical summary to a table. By specifying the table number, %ANYTL
will find and display the corresponding titles and footnotes stored in a title/footnote text file. The corresponding
N=xxx under each treatment column is automatically determined by reading the last title line. In our
example, ‘Safety Population’ is the last line, so %ANYTL will find the count of patients WHERE SAFETY=1 in the
PROF dataset. If there is text in the first column header (‘Demographic Parameter’ in our example), type
it in a pair of quotation marks after the table number; otherwise, the first column header will be blank.
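That automatic N=xxx lookup corresponds roughly to a simple query; a sketch, assuming PID identifies subjects in PROF:

/* Derive the (N=XXX) header counts: distinct safety-population */
/* subjects per treatment group. PID is an assumed subject key. */
proc sql;
   create table bign as
   select treat, count(distinct pid) as bign
   from prof
   where safety = 1
   group by treat;
quit;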
%ANYTL will calculate the width of each column automatically if not specified (to specify the width, refer to section
2.2.6).
• HIERARCHICAL TABLE
A hierarchical table is another common table type, in which the summary statistics are calculated by some class
variables. In the example below (Table 2), the statistics of the continuous variable VALUE from the LAB dataset are
calculated by Clinical Lab Parameter (TEST) and then by Visit (VISIT).
Table 2
Hematology Results in Conventional Units
Safety Population
Laboratory Parameter Treatment A Treatment B
Visit (N=xx) (N=xx)
Hemoglobin (g/dL)
Baseline
Mean xx.x xx.x
SD x.xx x.xx
Median xx.xx xx.xx
Min, Max xx, xx xx, xx
n xxx xxx
Visit 1
Mean xx.x xx.x
SD x.xx x.xx
Median xx.xx xx.xx
Min, Max xx, xx xx, xx
n xxx xxx
......
Similar to PROC TABULATE, ‘*’ is used to indicate the class variables TEST and VISIT before the analysis variable.
The call will be:
%ANYTL(LAB.TEST*VISIT*VALUE,
TREAT,
Table 2 'Laboratory Parameter~ Visit');
Don’t forget that, if needed, you can always select the display range by adding a (=...) option after the
corresponding variable name. For example, to display only the Hemoglobin test at Baseline and Visit 1, call:
%ANYTL(LAB.TEST(='Hemoglobin (g/dL)')*VISIT(=0=1)*VALUE,
TREAT,
Table 2 'Laboratory Parameter~ Visit');
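Behind the scenes, this hierarchical summary corresponds roughly to putting the class variables on a CLASS statement, as in this sketch (names follow the example above):

/* Statistics of VALUE by TEST, VISIT and TREAT, restricted to */
/* the rows selected in the %ANYTL call above.                 */
proc means data=lab n mean std median min max nway;
   where test = 'Hemoglobin (g/dL)' and visit in (0, 1);
   class test visit treat;
   var value;
run;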
A hierarchical categorical table can be handled in the same fashion. For example, the numerator, denominator and
percentage in Table 3 are all calculated by VISIT.
Table 3
Abnormal Physical Examination
Safety Population
Treatment A Treatment B
Visit (N=XXX) (N=XXX)
Body System n/N1 (%) n/N1 (%)
Baseline
Cardiovascular x/xxx (x.xx) x/xxx (x.xx)
Respiratory x/xxx (x.xx) x/xxx (x.xx)
Gastrointestinal x/xxx (x.xx) x/xxx (x.xx)
Skin x/xxx (x.xx) x/xxx (x.xx)
Visit 1
.... x/xxx (x.xx) x/xxx (x.xx)
To do this, simply put the class variable VISIT* in both the numerator and the denominator portions:
%ANYTL(PE.VISIT*PETERM///VISIT*SAFETY(=1)#,
TREAT,
Table 3 'Visit Body System');
• ADVERSE EVENT (AE) TABLE
An AE table is a special type of hierarchical table (see Table 4) in that:
(1) The percentages are calculated for every term level.
(2) Under each treatment column, there can be sub-columns for relationship or severity. If one subject has
multiple degrees of relationship (or severity), only the most related (or most severe) AE incidence is used in the
calculation.
Table 4
Incidence of Treatment Emergent Adverse Events
by Treatment Group, System Organ Class, Preferred Term, and Relationship to Study Drug
Safety Population
System Organ Class/ Treatment A Treatment B
Preferred Term (N=XXX) (N=XXX)
Unrelated Related Unrelated Related
n (%) n (%) n (%) n (%)
System Organ Class1 [bodsys] xx (xx.x) xx (xx.x) xx (xx.x) xx (xx.x)
Preferred Term1 [prefterm] xx (xx.x) xx (xx.x) xx (xx.x) xx (xx.x)
Preferred Term2 xx (xx.x) xx (xx.x) xx (xx.x) xx (xx.x)
System Organ Class2 xx (xx.x) xx (xx.x) xx (xx.x) xx (xx.x)
..........
In %ANYTL, ‘**’ is used to distinguish AE tables from regular hierarchical tables. Here, the call will be:
%ANYTL(TEAE(WHERE=(SAFETY=1 and TEAE=1)).BODSYS**PREFTERM/PROF.SAFETY(=1)#,
TREAT**RELATED,
Table 4 'System Organ Class/Preferred Term');
The ** in ‘BODSYS**PREFTERM’ in the 1st parameter commands %ANYTL to produce the TEAE count by both
System Organ Class (BODSYS) and Preferred Term (PREFTERM) for the safety population. ‘/SAFETY(=1)’ tells
%ANYTL that the denominator of the percentage is the number of subjects in the safety population from the PROF
dataset. TREAT**RELATED in the 2nd parameter indicates that there are subcolumns for relationship under each
treatment column, and that only the most related incidence will be counted if there are multiple incidences of an AE
for a subject. If there are multiple levels of subcolumns, express them in the same way. For example,
PROTOCOL*TREAT**RELATED will produce three levels of subcolumns.
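The ‘most related per subject’ rule corresponds roughly to the sketch below, which assumes RELATED is numeric with larger values meaning more related (PID and the actual coding are assumptions):

/* For each subject and preferred term, keep only the record */
/* with the highest RELATED code.                            */
proc sort data=teae;
   by pid bodsys prefterm related;
run;

data most_related;
   set teae;
   by pid bodsys prefterm related;
   if last.prefterm;   /* highest RELATED within subject and term */
run;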
2.2 ADVANCED TABLE APPLICATION
Most tables, including the three basic table types, can be produced using the methods described above. In this
section, some advanced features of %ANYTL will be introduced.
2.2.1 MODIFY THE RESULT BEFORE OUTPUTTING TO A TABLE
A unique feature of %ANYTL is that it allows one to modify the summary dataset before outputting it to a table,
which accommodates non-standard requests. To do this, simply leave the 3rd parameter (TAB_DEF) empty and
%ANYTL will output a summary dataset named _ANYSUM instead of a table. After the DATA step
manipulation, call %ANYTL again with the 3rd parameter filled in, and the table will be output.
For example, in Table 1, when a percentage is 100.0%, the default display of %ANYTL is (100), and %ANYTL
does not have an option to display it as (100.0). However, this can easily be achieved with a DATA step after
calling the macro. The steps for this modification are shown below:
First, produce the summary dataset (note that the 3rd parameter is left blank):
%ANYTL(PROF.AGE'Age (years)' SEX'Sex, n(%)'/SAFETY(=1),
TREAT+TOTAL, );
A dataset named _ANYSUM will be produced, which looks like:
_NAME_ _1 _2 _3
Age (years)
Mean 41.3 38.7 40.2
SD 6.7 4.3 5.2
Median 38.8 42.5 41.1
Min, Max 35, 52 36, 55 35, 55
n 50 50 100
Sex, n (%)
Male 50 (100) 10 (20.0) 60 (60.0)
Female 0 40 (80.0) 40 (40.0)
Each variable corresponds to a column that will be displayed in the table, so you can use a DATA step to modify the
dataset _ANYSUM to get the desired result:
DATA _ANYSUM;
SET _ANYSUM;
_1=TRANWRD(_1, '(100)', '(100.0)'); /* display 100% as (100.0) in the first column */
RUN;
Then call %ANYTL again (this time leaving the first 2 parameters blank):
%ANYTL( , , Table 1 ‘Demographic Parameter’);
Table 1 will be output, now using the modified result:
Table 1
Demographic Characteristics
Safety Population
Demographic Treatment A Treatment B Total
Parameter (N=XXX) (N=XXX) (N=XXX)
Age (years)
Mean 41.3 38.7 40.2
SD 6.7 4.3 5.2
Median 38.8 42.5 41.1
Min, Max 35, 52 36, 55 35, 55
n 50 50 100
Sex, n (%)
Male 50 (100.0) 10 (20.0) 60 (60.0)
Female 0 40 (80.0) 40 (40.0)
2.2.2 STATISTICAL MODEL
Statistics like P-value, CI, etc. are often needed. %ANYTL has a statistical model library for common statistical
methods. Simply add the corresponding statistical model name after the response variable, separated by ‘+’. For
example, to produce Table 1b, call %ANYTL as:
%ANYTL(PROF.AGE'Age (years)'+wilcoxon
SEX'Sex, n(%)'/SAFETY(=1)+fisher,
TREAT+TOTAL,
Table 1b 'Demographic Parameter');
Then the p-value of the Wilcoxon test for AGE and of Fisher’s exact test for SEX will be added.
Table 1b
Demographic Characteristics
Safety Population
Demographic Treatment A Treatment B Total
Parameter (N=XXX) (N=XXX) (N=XXX) P-Value
Age (years)
Mean xx.x xx.x xx.x 0.xxx
SD x.xx x.xx x.xx
Median xx.x xx.x xx.x
Min, Max xx, xx xx, xx xx, xx
n xxx xxx xxx
Sex, n (%)
Male xx (xx.x) xx (xx.x) xx (xx.x) 0.xxx
Female xx (xx.x) xx (xx.x) xx (xx.x)
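For reference, the wilcoxon and fisher model names correspond to standard SAS tests; a minimal sketch of the equivalent stand-alone calls:

/* Wilcoxon rank-sum test of AGE across treatment groups. */
proc npar1way data=prof wilcoxon;
   class treat;
   var age;
run;

/* Fisher's exact test of SEX by treatment. */
proc freq data=prof;
   tables sex * treat;
   exact fisher;
run;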
2.2.3 PARALLEL SUMMARIES
Table 5
Vital Signs
Safety Population
Visit                 Treatment A                Treatment B
  Statistics            (N=XXX)                    (N=XXX)
                  Baseline  Actual  Change   Baseline  Actual  Change
Day 1
Mean xx.x xx.x xx.x xx.x xx.x xx.x
SD x.xx x.xx x.xx x.xx x.xx x.xx
Median xx xx xx xx xx xx
Min, Max xx, xx xx, xx xx, xx xx, xx xx, xx xx, xx
n xxx xxx xxx xxx xxx xxx
In Table 5, the subcolumns Baseline, Actual and Change are based on 3 variables: BSLVAL, VALUE and
CHANGE, respectively. %ANYTL uses ‘|’ to separate them. The macro call is:
%ANYTL( VITAL.VISIT*BSLVAL|VALUE|CHANGE,
TREAT,
Table 5 'Visit Statistics');
Note that BSLVAL, VALUE and CHANGE are the analysis variables, not the class variables. Therefore they are
placed in the 1st parameter (ROW_DEF), not the 2nd parameter (COL_DEF).
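A rough base SAS sketch of the parallel summary, with all three analysis variables on one VAR statement:

/* Side-by-side statistics for the baseline, actual and change */
/* values, by visit and treatment.                             */
proc means data=vital n mean std median min max nway;
   class visit treat;
   var bslval value change;
run;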
2.2.4 GROUP VARIABLES TOGETHER
Table 1c
Demographic Characteristics
Safety Population
Treatment A Treatment B Total
Demographic Parameter (N=XXX) (N=XXX) (N=XXX)
Race, n (%)
White xx (xx.x) xx (xx.x) xx (xx.x)
American Indian or Alaska Native xx (xx.x) xx (xx.x) xx (xx.x)
Asian xx (xx.x) xx (xx.x) xx (xx.x)
Black or African American xx (xx.x) xx (xx.x) xx (xx.x)
Pacific Islander xx (xx.x) xx (xx.x) xx (xx.x)
Sometimes the results for several variables need to be grouped together. In Table 1c, the categories of race
actually come from different variables (WHITE, INDIAN, ASIAN, BLACK and PACIF, each with value 1 or
missing). In %ANYTL, parentheses can be used to group the results together:
%ANYTL(PROF.( 'Race, n (%)'
WHITE' White'/SAFETY(=1)
INDIAN' American Indian or Alaska Native'/SAFETY(=1)
ASIAN' Asian'/SAFETY(=1)
BLACK' Black or African American'/SAFETY(=1)
PACIF' Pacific Islander'/SAFETY(=1)),
TREAT,
Table 1c 'Demographic Parameter');
Since the denominators for these variables are identical, the call can be further simplified:
%ANYTL(PROF.( 'Race, n (%)'
WHITE' White'
INDIAN' American Indian or Alaska Native'
ASIAN' Asian'
BLACK' Black or African American'
PACIF' Pacific Islander')/SAFETY(=1),
TREAT,
Table 1c 'Demographic Parameter');
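One way to reproduce such a grouped summary outside the macro is to restructure the flag variables into a single categorical variable first. A rough sketch, assuming each flag is 1 or missing and TREAT is the treatment variable:

/* Turn the five 0/1 race flags into one categorical variable, */
/* then count as usual.                                        */
data race_long;
   set prof;
   length race $ 40;
   array flags{5} white indian asian black pacif;
   array labels{5} $ 40 _temporary_
      ('White', 'American Indian or Alaska Native', 'Asian',
       'Black or African American', 'Pacific Islander');
   do i = 1 to 5;
      if flags{i} = 1 then do;
         race = labels{i};
         output;
      end;
   end;
   drop i;
run;

proc freq data=race_long;
   tables race * treat;
run;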
2.2.5 MORE THAN ONE TEXT COLUMN
Table 6
Response by Region and Country
Safety Population
Treatment A Treatment B
Region/Country Response n (%) n (%)
Eastern Europe
Romania N XXX XXX
Cure xxx (xx.x) xxx (xx.x)
Failure xxx (xx.x) xxx (xx.x)
Indeterminate xxx (xx.x) xxx (xx.x)
Russia
...
In the hierarchical Table 6, there are 2 header columns. The first column contains the class variables (REGION and
COUNTRY), and the second column holds the n and the response categories under each country. To create the 2
columns, insert ‘-’ after the class variable(s) of the first column. Parentheses are used to group the statistics by
region and country. Correspondingly, the header texts (‘Region/Country’, ‘ Response’) for the two columns
are separated by ‘-’ in the 3rd parameter. The call is:
%ANYTL(CLIN.REGION*COUNTRY-*(SAFETY'N'(=1)/1
RESP/REGION*COUNTRY*SAFETY(=1)#),
TREAT,
Table 6 'Region/Country'-' Response');
2.2.6 SPECIFY THE COLUMN WIDTH
%ANYTL will automatically calculate the width of each column in the table and flow the text if needed. However, you
can choose to specify the width of the header columns after the corresponding column header text in the 3rd
parameter. For example, to limit the first column of Table 6 to 20 characters and the second to 25, call the
following:
%ANYTL(CLIN.REGION*COUNTRY-*(SAFETY'N'(=1)/1
RESP/REGION*COUNTRY*SAFETY(=1)#),
TREAT,
Table 6 'Region/Country'/20-' Response'/25);
2.2.7 SORTING AE TABLE
You can specify how each AE term level should be sorted by adding an option in brackets after each term level name. There are 4 options:
AC: ascending alphabetically (the default; can be omitted)
DC: descending alphabetically
AN: ascending by count(s) of specified column(s)
DN: descending by count(s) of specified column(s)
For example, you can create an AE table in which every AE term level is sorted differently.
%ANYTL( TEAE.TERM1[dc]**TERM2[an(=2)]**TERM3[dn(=2)**(=2=3)]/PROF.SAFETY(=1)#,
TREAT**SEVERITY,
Table xx);
TERM1[dc] means TERM1 will be sorted descending alphabetically.
TERM2[an(=2)] means TERM2 will be sorted ascending by the sum of the counts of the columns where TREAT=2.
TERM3[dn(=2)**(=2=3)] means TERM3 will be sorted descending by the sum of the counts of the columns where
TREAT=2 and SEVERITY in (2, 3).
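For reference, a count-based sort like [dn(=2)] corresponds roughly to the query below, which orders TERM2 by the descending subject count in the TREAT=2 columns (PID and the numeric TREAT coding are assumptions):

/* Order terms by descending distinct-subject count in group 2. */
proc sql;
   create table term2_order as
   select term2, count(distinct pid) as n2
   from teae
   where treat = 2
   group by term2
   order by n2 desc, term2;
quit;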
2.2.8 MULTIPLE PAGES
If there are too many columns to fit on one page, they have to be displayed across multiple pages. In %ANYTL, you
can either specify the maximum number of columns allowed on each page or provide a page-assignment group
variable in the 2nd parameter.
For example, %ANYTL(..., TREAT/4, ...) will put only 4 summary columns on each page.
%ANYTL(..., TREAT/PAGEGRP, ...) will put the summary columns on the pages defined by the group
variable PAGEGRP.
3. OUTPUT A LISTING QUICKLY
%ANYTL is also capable of producing a simple listing quickly. A listing is regarded as a table defined solely by
columns and table number in %ANYTL.
Listing 1
Subjects with at least one Serious AE
Safety Population
Demographic Information
Date of first/ Preferred Term/
Treatment Group Subject ID Age/Sex/Race last dose Investigator Term
Treatment A 001 25/Male/Asian 2001-01-01/ Pref. Term 1/
2001-07-02 Inv. Term2
002 35/Female/Caucasian 2001-02-02/ Pref. Term 1/
2001-08-02 Inv. Term2
...
Treatment B 007 55/Female/Asian 2001-03-03/ Pref. Term 1/
2001-09-02 Inv. Term2
008 25/Male/Asian 2001-04-04/ Pref. Term 1/
2001-10-02 Inv. Term2
...
......
To produce the example Listing 1, call the following:
%ANYTL( ,
AE(WHERE=(SAE=1)).('Demographic Information'
TREAT'Treatment Group'
PID'Subject ID'
(AGED/ SEX/~RACEC)'Age/Sex/Race'
(FDSDT/~LDSDT)'Date of First/Last Dose of Study Drug')
(PT91/~ AE)'Preferred Term/Investigator Term'/20or,
Listing 1);
The 1st parameter is left blank to indicate this is a listing.
In the 2nd parameter, the dataset name (AE) and the column variable names are separated by ‘.’.
Each column is described as VARNAME‘header text’/options; columns are separated from each other by spaces.
If there is a column header that spans multiple columns (‘Demographic Information’ in the example), put the
text (in quotation marks) before the column items and use parentheses, as in the COLUMN statement of PROC REPORT.
If a column is a concatenation of several variables, simply put them within a set of parentheses. All the special
characters between the variables will be kept. For example, the 3rd column ‘(AGED/ SEX/~RACEC)’ is equivalent
to:
var3=PUT(AGED, best.)||'/'||' '||PUT(SEX, sexf.)||'/~'||RACEC;
If variables need to be ordered but not displayed, put the ‘/np’ option after the variable name (equivalent to the
NOPRINT option). If variables need to be ordered and grouped, put the ‘/or’ option after the name of the LAST
group variable. If the width of a column needs to be fixed, put the width after it; otherwise, it will be determined
automatically.
For example, the 5th column (PT91/~ AE)‘Preferred Term/Investigator Term’/20or is equivalent to:
var5=STRIP(PT91)||'/~ '||STRIP(AE); ...
DEFINE var5 / ORDER ORDER=INTERNAL 'Preferred Term/Investigator Term' WIDTH=20;
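Putting the pieces together, the listing corresponds roughly to a PROC REPORT step like the sketch below; AE_LISTING stands for a hypothetical dataset in which the concatenated columns VAR3 and VAR5 have already been derived:

/* Skeleton of the listing: ordered/grouped columns on the left, */
/* derived concatenated columns with fixed or automatic widths.  */
proc report data=ae_listing nowd headline;
   column treat pid var3 var5;
   define treat / order 'Treatment Group';
   define pid   / order 'Subject ID';
   define var3  / display 'Age/Sex/Race';
   define var5  / order order=internal width=20
                  'Preferred Term/Investigator Term';
run;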
CONCLUSION
Only a portion of the usage details is presented here. For example, %ANYTL can also be a useful tool for validating
tables (refer to my ‘Three Step Table Validation’ paper, PharmaSUG 2007). This paper is not intended to serve as a
user manual, since %ANYTL is designed to fit my company’s table template and may not be directly applicable to
other companies. However, the concept of the three dimensions of tables/listings, and of using special characters to
describe each component of the table structure, can be applied universally. As long as your tables/listings follow a
relatively standard template, you can create a similar macro.
Admittedly, it was a challenge to write this macro compared with traditional macros, since it must use basic SAS®
macro functions such as %SCAN and %LENGTH, along with DATA _NULL_ steps, to dissect the parameters before
actually analyzing the source datasets. In addition, there were numerous issues to consider, because much of the
needed information is not explicitly provided: the type and format of variables, quoted special characters, alignment,
missing data handling, the order of each item, etc. all added to the complexity. Nonetheless, the advantages of
%ANYTL are apparent to any programmer in the pharmaceutical industry:
• %ANYTL is a single macro, not a series of macros, and its grammar is consistent, logical and easy to
follow. You will never have to memorize numerous parameter names again.
• Instead of being a black box, %ANYTL allows you to predict the table output just by reading the code.
Thus it is comprehensible to any %ANYTL user.
• %ANYTL code is very succinct and minimizes the task of pre-handling data before table production.
• It is very versatile. You can specify different source datasets, denominators, selection ranges,
sorting orders and display styles for each item in the table. The result can also be modified before
being output to a table. These tasks are usually awkward to accomplish in traditional macros. Therefore, this
single macro can handle most table/listing tasks that, in the past, were performed by several complicated
macros.
• It is expandable. Without a doubt, there will be new studies and new requests that the current version
cannot handle. %ANYTL is built in such a manner that I can always add more symbols or symbol
combinations, functions and statistics to meet the needs of the evolving industry.
REFERENCES
Yang Chen, ‘Three Step Table Validation,’ PharmaSUG 2007.
CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:
Name: Yang Chen
Enterprise: Forest Research Institute
Address: Forest Lab, Financial Plaza V
City, State ZIP: Jersey City, NJ 07311
Work Phone: (201) 253-6625
E-mail: snail_chen@hotmail.com
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS
Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.