SlideShare a Scribd company logo
1 of 22
Download to read offline
QUIRKS OF R
BAY AREA USER GROUP MEETUP
SEPTEMBER 13, 2016
|Ankur Gupta  @ankurio
WHICH PACKAGE LOADS?
> library(ggplot2)
> # ggplot2 loads
WHICH PACKAGE LOADS?
> library("ggplot2")
> # ggplot2 loads
WHICH PACKAGE LOADS?
dplyr <- "ggplot2"
> library(dplyr)
Attaching package: ‘dplyr’
The following object is masked from ‘package:stats’:
filter
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
> # dplyr loads!
R parses unquoted text.
This is non-standard evaluation or NSE .
NON-STANDARD EVALUATION (NSE)
HAPPENS EVERYWHERE
base rm, ls, subset
utils demo, example
graphics plot
ggplot2 aes
plyr summarize
dplyr filter, select, arrange, summarize
We will focus on subset().
subset()USES NSE
> pop.df
state year pop
1 CA 2005 37
2 WI 2005 6
3 CA 2015 39
4 WI 2015 6
> subset(pop.df, year == 2015)
state year pop
3 CA 2015 39
4 WI 2015 6
yearis the column name (in dataframe scope)
subset()USES NSE WITH MIXED SCOPING
> pop.df
state year pop
1 CA 2005 37
2 WI 2005 6
3 CA 2015 39
4 WI 2015 6
> x <- 2015
> subset(pop.df, year == x)
state year pop
3 CA 2015 39
4 WI 2015 6
yearis the column name (in dataframe scope)
xis the global variable (global scope)
subset()CAN'T DIFFERENTIATE SCOPES
> pop.df
state year pop
1 CA 2005 37
2 WI 2005 6
3 CA 2015 39
4 WI 2015 6
results.by.year.1 <- function(df, year) {
df.subset <- subset(df, year == year)
# Do some computation
return_value <- nrow(df.subset)
return(return_value)
}
> results.by.year.1(pop.df, 2015)
[1] 4
> # Incorrect result. Silent mistake! Correct result is 2.
Using yearas an argument name is a bad idea.
Let's change it.
APPARENT FIX
> pop.df
state year pop
1 CA 2005 37
2 WI 2005 6
3 CA 2015 39
4 WI 2015 6
results.by.year.2 <- function(df, yr) {
df.subset <- subset(df, year == yr)
# Do some computation
return_value <- nrow(df.subset)
return(return_value)
}
> results.by.year.2(pop.df, 2015)
[1] 2
> # Correct result
This doesn't solve the problem.
It only hides the problem.
HIDDEN PROBLEM WITH subset()
> pop.df
state year pop yr rate
1 CA 2005 37 2005 4.01
2 WI 2005 6 1000 2.00
3 CA 2015 39 3000 6.00
4 WI 2015 6 5000 10.00
results.by.year.2 <- function(df, yr) {
df.subset <- subset(df, year == yr)
# Do some computation
return_value <- nrow(df.subset)
return(return_value)
}
> results.by.year.2(pop.df, 2015)
[1] 1
> # Incorrect result. Silent mistake! Correct result is 2.
How do we choose the second argument name for
results.by.year.x()?
DIRTY FIX
> pop.df
state year pop yr rate
1 CA 2005 37 2005 4.01
2 WI 2005 6 1000 2.00
3 CA 2015 39 3000 6.00
4 WI 2015 6 5000 10.00
results.by.year.3 <- function(df, yr) {
testthat::expect_false("yr" %in% names(df))
df.subset <- subset(df, year == yr)
# Do some computation
return_value <- nrow(df.subset)
return(return_value)
}
> results.by.year.3(pop.df, 2015)
Error: "yr" %in% names(df) is not false
> # Throws error. At least it is not a silent mistake.
This will stop code execution but it won't make a silent
mistake.
BETTER FIX
> pop.df
state year pop yr rate
1 CA 2005 37 2005 4.01
2 WI 2005 6 1000 2.00
3 CA 2015 39 3000 6.00
4 WI 2015 6 5000 10.00
results.by.year.4 <- function(df, yr) {
df.subset <- df[df[["year"]] == yr, , drop = FALSE]
# Do some computation
return_value <- nrow(df.subset)
return(return_value)
}
> results.by.year.4(pop.df, 2015)
[1] 2
> # Correct result
Don't use subset(). Use logical indexing.
(Remember to use drop = FALSE)
WHAT ABOUT filterFUNCTIONS IN dplyr?
dplyrcontains two functions that do the job of
subset():
filter()- uses NSE
filter_()- uses SE
Do these solve the problem?
filter()WITH SINGLE SCOPE
> pop.df
state year pop yr rate
1 CA 2005 37 2005 4.01
2 WI 2005 6 1000 2.00
3 CA 2015 39 3000 6.00
4 WI 2015 6 5000 10.00
> # NSE function, single scope
> dplyr::filter(pop.df, year == 2015)
state year pop yr rate
1 CA 2015 39 3000 6
2 WI 2015 6 5000 10
> # Correct result
filter()WITH MIXED SCOPE
> pop.df
state year pop yr rate
1 CA 2005 37 2005 4.01
2 WI 2005 6 1000 2.00
3 CA 2015 39 3000 6.00
4 WI 2015 6 5000 10.00
> # NSE function, mixed scope
> year <- 2015
> dplyr::filter(pop.df, year == year)
state year pop yr rate
1 CA 2005 37 2005 4.01
2 WI 2005 6 1000 2.00
3 CA 2015 39 3000 6.00
4 WI 2015 6 5000 10.00
> # Incorrect result
filter_()WITH SINGLE SCOPE
> pop.df
state year pop yr rate
1 CA 2005 37 2005 4.01
2 WI 2005 6 1000 2.00
3 CA 2015 39 3000 6.00
4 WI 2015 6 5000 10.00
> # SE function, single scope
> dplyr::filter_(pop.df, "year == 2015")
state year pop yr rate
1 CA 2015 39 3000 6
2 WI 2015 6 5000 10
> # Correct result
filter_()WITH MIXED SCOPE
> pop.df
state year pop yr rate
1 CA 2005 37 2005 4.01
2 WI 2005 6 1000 2.00
3 CA 2015 39 3000 6.00
4 WI 2015 6 5000 10.00
> # SE function, mixed scope
> year <- 2015
> dplyr::filter_(pop.df, "year == year")
state year pop yr rate
1 CA 2005 37 2005 4.01
2 WI 2005 6 1000 2.00
3 CA 2015 39 3000 6.00
4 WI 2015 6 5000 10.00
> # Incorrect result
Using a SE function by itself does not solve the problem of
mixed scope.
PROBLEM OF MIXED SCOPE REMAINS
> pop.df
state year pop yr rate
1 CA 2005 37 2005 4.01
2 WI 2005 6 1000 2.00
3 CA 2015 39 3000 6.00
4 WI 2015 6 5000 10.00
results.by.year.5 <- function(df, yr) {
df.subset <- dplyr::filter_(pop.df, "year == yr")
# Do some computation
return_value <- nrow(df.subset)
return(return_value)
}
> results.by.year.5(pop.df, 2015)
[1] 1
> # Incorrect result. Silent mistake! Correct result is 2.
yearis the column name (in dataframe scope)
yris in both in dataframe scope and function scope
FIX USING CONSTRUCTED CONDITION
> pop.df
state year pop yr rate
1 CA 2005 37 2005 4.01
2 WI 2005 6 1000 2.00
3 CA 2015 39 3000 6.00
4 WI 2015 6 5000 10.00
results.by.year.6 <- function(df, yr) {
condition <- lazyeval::interp(~year == x, x = yr)
df.subset <- dplyr::filter_(pop.df, .dots = condition)
# Do some computation
return_value <- nrow(df.subset)
return(return_value)
}
> results.by.year.6(pop.df, 2015)
[1] 2
> # Correct result
(can be slow; possible precision issues with doubles)
NSE + MIXED SCOPE = SILENT MISTAKES
# Dirty Fix - Put in assert-like statements
testthat::expect_false("yr" %in% names(df))
df.subset <- subset(df, year == yr)
# Fix 1 - Use SE with constructed condition
condition <- lazyeval::interp(~year == x, x = yr)
df.subset <- dplyr::filter_(pop.df, .dots = condition)
# Fix 2 - Use logical indexing
df.subset <- df[df[["year"]] == yr, , drop = FALSE]
TAKEAWAYS
NSE can cause problems with scope
Even with SE, mixed scope can cause problems
Add assert-like statements, especially with NSE
For non-interactive code, package writing, use logical
indexing instead
THANK YOU
Slides: tiny.cc/quirksofr
|Ankur Gupta  @ankurio

More Related Content

Viewers also liked (7)

параметринамузикалнитезвуци
параметринамузикалнитезвуципараметринамузикалнитезвуци
параметринамузикалнитезвуци
 
3D Interior Rendering of Living Rooms
3D Interior Rendering of Living Rooms3D Interior Rendering of Living Rooms
3D Interior Rendering of Living Rooms
 
3D Rendering of Global Brands
3D Rendering of Global Brands3D Rendering of Global Brands
3D Rendering of Global Brands
 
3D interior rendering of bedrooms
3D interior rendering of bedrooms3D interior rendering of bedrooms
3D interior rendering of bedrooms
 
_IRS Legacy Statement CI Division (paul c)
_IRS Legacy Statement CI Division (paul c)_IRS Legacy Statement CI Division (paul c)
_IRS Legacy Statement CI Division (paul c)
 
Murray Gabriel Djupe #2
Murray Gabriel Djupe #2Murray Gabriel Djupe #2
Murray Gabriel Djupe #2
 
Diapositivas presupuesto de compra
Diapositivas presupuesto de compraDiapositivas presupuesto de compra
Diapositivas presupuesto de compra
 

Similar to QuirksofR

E2 – Fundamentals, Functions & ArraysPlease refer to announcemen.docx
E2 – Fundamentals, Functions & ArraysPlease refer to announcemen.docxE2 – Fundamentals, Functions & ArraysPlease refer to announcemen.docx
E2 – Fundamentals, Functions & ArraysPlease refer to announcemen.docx
jacksnathalie
 
E2 – Fundamentals, Functions & ArraysPlease refer to announcements.docx
E2 – Fundamentals, Functions & ArraysPlease refer to announcements.docxE2 – Fundamentals, Functions & ArraysPlease refer to announcements.docx
E2 – Fundamentals, Functions & ArraysPlease refer to announcements.docx
shandicollingwood
 
Forecasting Revenue With Stationary Time Series Models
Forecasting Revenue With Stationary Time Series ModelsForecasting Revenue With Stationary Time Series Models
Forecasting Revenue With Stationary Time Series Models
Geoffery Mullings
 
Firebird 3 Windows Functions
Firebird 3 Windows  FunctionsFirebird 3 Windows  Functions
Firebird 3 Windows Functions
Mind The Firebird
 
Cis 170 c ilab 5 of 7 arrays and strings
Cis 170 c ilab 5 of 7 arrays and stringsCis 170 c ilab 5 of 7 arrays and strings
Cis 170 c ilab 5 of 7 arrays and strings
CIS321
 
SOFTWARE ENGINEERING - FINAL PRESENTATION Slides
SOFTWARE ENGINEERING - FINAL PRESENTATION SlidesSOFTWARE ENGINEERING - FINAL PRESENTATION Slides
SOFTWARE ENGINEERING - FINAL PRESENTATION Slides
Jeremy Zhong
 
Alv object model simple 2 d table - event handling
Alv object model   simple 2 d table - event handlingAlv object model   simple 2 d table - event handling
Alv object model simple 2 d table - event handling
anil kumar
 
College for Women Department of Information Science
College for Women  Department of Information Science  College for Women  Department of Information Science
College for Women Department of Information Science
WilheminaRossi174
 

Similar to QuirksofR (20)

E2 – Fundamentals, Functions & ArraysPlease refer to announcemen.docx
E2 – Fundamentals, Functions & ArraysPlease refer to announcemen.docxE2 – Fundamentals, Functions & ArraysPlease refer to announcemen.docx
E2 – Fundamentals, Functions & ArraysPlease refer to announcemen.docx
 
E2 – Fundamentals, Functions & ArraysPlease refer to announcements.docx
E2 – Fundamentals, Functions & ArraysPlease refer to announcements.docxE2 – Fundamentals, Functions & ArraysPlease refer to announcements.docx
E2 – Fundamentals, Functions & ArraysPlease refer to announcements.docx
 
Devry cis 170 c i lab 5 of 7 arrays and strings
Devry cis 170 c i lab 5 of 7 arrays and stringsDevry cis 170 c i lab 5 of 7 arrays and strings
Devry cis 170 c i lab 5 of 7 arrays and strings
 
Devry cis 170 c i lab 5 of 7 arrays and strings
Devry cis 170 c i lab 5 of 7 arrays and stringsDevry cis 170 c i lab 5 of 7 arrays and strings
Devry cis 170 c i lab 5 of 7 arrays and strings
 
Forecasting Revenue With Stationary Time Series Models
Forecasting Revenue With Stationary Time Series ModelsForecasting Revenue With Stationary Time Series Models
Forecasting Revenue With Stationary Time Series Models
 
Devry cis-170-c-i lab-5-of-7-arrays-and-strings
Devry cis-170-c-i lab-5-of-7-arrays-and-stringsDevry cis-170-c-i lab-5-of-7-arrays-and-strings
Devry cis-170-c-i lab-5-of-7-arrays-and-strings
 
Devry cis-170-c-i lab-5-of-7-arrays-and-strings
Devry cis-170-c-i lab-5-of-7-arrays-and-stringsDevry cis-170-c-i lab-5-of-7-arrays-and-strings
Devry cis-170-c-i lab-5-of-7-arrays-and-strings
 
Firebird 3 Windows Functions
Firebird 3 Windows  FunctionsFirebird 3 Windows  Functions
Firebird 3 Windows Functions
 
Logistic Regression, Linear and Quadratic Discriminant Analysis and K-Nearest...
Logistic Regression, Linear and Quadratic Discriminant Analysis and K-Nearest...Logistic Regression, Linear and Quadratic Discriminant Analysis and K-Nearest...
Logistic Regression, Linear and Quadratic Discriminant Analysis and K-Nearest...
 
Common Coding and Design mistakes (that really mess up performance)
Common Coding and Design mistakes (that really mess up performance)Common Coding and Design mistakes (that really mess up performance)
Common Coding and Design mistakes (that really mess up performance)
 
Assignment 2 lab 3 python gpa calculator
Assignment 2 lab 3  python gpa calculatorAssignment 2 lab 3  python gpa calculator
Assignment 2 lab 3 python gpa calculator
 
Cis 170 c ilab 5 of 7 arrays and strings
Cis 170 c ilab 5 of 7 arrays and stringsCis 170 c ilab 5 of 7 arrays and strings
Cis 170 c ilab 5 of 7 arrays and strings
 
IBM Informix dynamic server 11 10 Cheetah Sql Features
IBM Informix dynamic server 11 10 Cheetah Sql FeaturesIBM Informix dynamic server 11 10 Cheetah Sql Features
IBM Informix dynamic server 11 10 Cheetah Sql Features
 
Austin Bingham. Transducers in Python. PyCon Belarus
Austin Bingham. Transducers in Python. PyCon BelarusAustin Bingham. Transducers in Python. PyCon Belarus
Austin Bingham. Transducers in Python. PyCon Belarus
 
SOFTWARE ENGINEERING - FINAL PRESENTATION Slides
SOFTWARE ENGINEERING - FINAL PRESENTATION SlidesSOFTWARE ENGINEERING - FINAL PRESENTATION Slides
SOFTWARE ENGINEERING - FINAL PRESENTATION Slides
 
Alv object model simple 2 d table - event handling
Alv object model   simple 2 d table - event handlingAlv object model   simple 2 d table - event handling
Alv object model simple 2 d table - event handling
 
College for Women Department of Information Science
College for Women  Department of Information Science  College for Women  Department of Information Science
College for Women Department of Information Science
 
College for women department of information science
College for women  department of information science  College for women  department of information science
College for women department of information science
 
Devry cis 170 c i lab 5 of 7 arrays and strings
Devry cis 170 c i lab 5 of 7 arrays and stringsDevry cis 170 c i lab 5 of 7 arrays and strings
Devry cis 170 c i lab 5 of 7 arrays and strings
 
第7回 大規模データを用いたデータフレーム操作実習(1)
第7回 大規模データを用いたデータフレーム操作実習(1)第7回 大規模データを用いたデータフレーム操作実習(1)
第7回 大規模データを用いたデータフレーム操作実習(1)
 

QuirksofR

  • 1. QUIRKS OF R BAY AREA USER GROUP MEETUP SEPTEMBER 13, 2016 |Ankur Gupta  @ankurio
  • 2. WHICH PACKAGE LOADS? > library(ggplot2) > # ggplot2 loads
  • 3. WHICH PACKAGE LOADS? > library("ggplot2") > # ggplot2 loads
  • 4. WHICH PACKAGE LOADS? dplyr <- "ggplot2" > library(dplyr) Attaching package: ‘dplyr’ The following object is masked from ‘package:stats’: filter The following objects are masked from ‘package:base’: intersect, setdiff, setequal, union > # dplyr loads! R parses unquoted text. This is non-standard evaluation or NSE .
  • 5. NON-STANDARD EVALUATION (NSE) HAPPENS EVERYWHERE base rm, ls, subset utils demo, example graphics plot ggplot2 aes plyr summarize dplyr filter, select, arrange, summarize We will focus on subset().
  • 6. subset()USES NSE > pop.df state year pop 1 CA 2005 37 2 WI 2005 6 3 CA 2015 39 4 WI 2015 6 > subset(pop.df, year == 2015) state year pop 3 CA 2015 39 4 WI 2015 6 yearis the column name (in dataframe scope)
  • 7. subset()USES NSE WITH MIXED SCOPING > pop.df state year pop 1 CA 2005 37 2 WI 2005 6 3 CA 2015 39 4 WI 2015 6 > x <- 2015 > subset(pop.df, year == x) state year pop 3 CA 2015 39 4 WI 2015 6 yearis the column name (in dataframe scope) xis the global variable (global scope)
  • 8. subset()CAN'T DIFFERENTIATE SCOPES > pop.df state year pop 1 CA 2005 37 2 WI 2005 6 3 CA 2015 39 4 WI 2015 6 results.by.year.1 <- function(df, year) { df.subset <- subset(df, year == year) # Do some computation return_value <- nrow(df.subset) return(return_value) } > results.by.year.1(pop.df, 2015) [1] 4 > # Incorrect result. Silent mistake! Correct result is 2. Using yearas an argument name is a bad idea. Let's change it.
  • 9. APPARENT FIX > pop.df state year pop 1 CA 2005 37 2 WI 2005 6 3 CA 2015 39 4 WI 2015 6 results.by.year.2 <- function(df, yr) { df.subset <- subset(df, year == yr) # Do some computation return_value <- nrow(df.subset) return(return_value) } > results.by.year.2(pop.df, 2015) [1] 2 > # Correct result This doesn't solve the problem. It only hides the problem.
  • 10. HIDDEN PROBLEM WITH subset() > pop.df state year pop yr rate 1 CA 2005 37 2005 4.01 2 WI 2005 6 1000 2.00 3 CA 2015 39 3000 6.00 4 WI 2015 6 5000 10.00 results.by.year.2 <- function(df, yr) { df.subset <- subset(df, year == yr) # Do some computation return_value <- nrow(df.subset) return(return_value) } > results.by.year.2(pop.df, 2015) [1] 1 > # Incorrect result. Silent mistake! Correct result is 2. How do we choose the second argument name for results.by.year.x()?
  • 11. DIRTY FIX > pop.df state year pop yr rate 1 CA 2005 37 2005 4.01 2 WI 2005 6 1000 2.00 3 CA 2015 39 3000 6.00 4 WI 2015 6 5000 10.00 results.by.year.3 <- function(df, yr) { testthat::expect_false("yr" %in% names(df)) df.subset <- subset(df, year == yr) # Do some computation return_value <- nrow(df.subset) return(return_value) } > results.by.year.3(pop.df, 2015) Error: "yr" %in% names(df) is not false > # Throws error. At least it is not a silent mistake. This will stop code execution but it won't make a silent mistake.
  • 12. BETTER FIX > pop.df state year pop yr rate 1 CA 2005 37 2005 4.01 2 WI 2005 6 1000 2.00 3 CA 2015 39 3000 6.00 4 WI 2015 6 5000 10.00 results.by.year.4 <- function(df, yr) { df.subset <- df[df[["year"]] == yr, , drop = FALSE] # Do some computation return_value <- nrow(df.subset) return(return_value) } > results.by.year.4(pop.df, 2015) [1] 2 > # Correct result Don't use subset(). Use logical indexing. (Remember to use drop = FALSE)
  • 13. WHAT ABOUT filterFUNCTIONS IN dplyr? dplyrcontains two functions that do the job of subset(): filter()- uses NSE filter_()- uses SE Do these solve the problem?
  • 14. filter()WITH SINGLE SCOPE > pop.df state year pop yr rate 1 CA 2005 37 2005 4.01 2 WI 2005 6 1000 2.00 3 CA 2015 39 3000 6.00 4 WI 2015 6 5000 10.00 > # NSE function, single scope > dplyr::filter(pop.df, year == 2015) state year pop yr rate 1 CA 2015 39 3000 6 2 WI 2015 6 5000 10 > # Correct result
  • 15. filter()WITH MIXED SCOPE > pop.df state year pop yr rate 1 CA 2005 37 2005 4.01 2 WI 2005 6 1000 2.00 3 CA 2015 39 3000 6.00 4 WI 2015 6 5000 10.00 > # NSE function, mixed scope > year <- 2015 > dplyr::filter(pop.df, year == year) state year pop yr rate 1 CA 2005 37 2005 4.01 2 WI 2005 6 1000 2.00 3 CA 2015 39 3000 6.00 4 WI 2015 6 5000 10.00 > # Incorrect result
  • 16. filter_()WITH SINGLE SCOPE > pop.df state year pop yr rate 1 CA 2005 37 2005 4.01 2 WI 2005 6 1000 2.00 3 CA 2015 39 3000 6.00 4 WI 2015 6 5000 10.00 > # SE function, single scope > dplyr::filter_(pop.df, "year == 2015") state year pop yr rate 1 CA 2015 39 3000 6 2 WI 2015 6 5000 10 > # Correct result
  • 17. filter_()WITH MIXED SCOPE > pop.df state year pop yr rate 1 CA 2005 37 2005 4.01 2 WI 2005 6 1000 2.00 3 CA 2015 39 3000 6.00 4 WI 2015 6 5000 10.00 > # SE function, mixed scope > year <- 2015 > dplyr::filter_(pop.df, "year == year") state year pop yr rate 1 CA 2005 37 2005 4.01 2 WI 2005 6 1000 2.00 3 CA 2015 39 3000 6.00 4 WI 2015 6 5000 10.00 > # Incorrect result Using a SE function by itself does not solve the problem of mixed scope.
  • 18. PROBLEM OF MIXED SCOPE REMAINS > pop.df state year pop yr rate 1 CA 2005 37 2005 4.01 2 WI 2005 6 1000 2.00 3 CA 2015 39 3000 6.00 4 WI 2015 6 5000 10.00 results.by.year.5 <- function(df, yr) { df.subset <- dplyr::filter_(pop.df, "year == yr") # Do some computation return_value <- nrow(df.subset) return(return_value) } > results.by.year.5(pop.df, 2015) [1] 1 > # Incorrect result. Silent mistake! Correct result is 2. yearis the column name (in dataframe scope) yris in both in dataframe scope and function scope
  • 19. FIX USING CONSTRUCTED CONDITION > pop.df state year pop yr rate 1 CA 2005 37 2005 4.01 2 WI 2005 6 1000 2.00 3 CA 2015 39 3000 6.00 4 WI 2015 6 5000 10.00 results.by.year.6 <- function(df, yr) { condition <- lazyeval::interp(~year == x, x = yr) df.subset <- dplyr::filter_(pop.df, .dots = condition) # Do some computation return_value <- nrow(df.subset) return(return_value) } > results.by.year.6(pop.df, 2015) [1] 2 > # Correct result (can be slow; possible precision issues with doubles)
  • 20. NSE + MIXED SCOPE = SILENT MISTAKES # Dirty Fix - Put in assert-like statements testthat::expect_false("yr" %in% names(df)) df.subset <- subset(df, year == yr) # Fix 1 - Use SE with constructed condition condition <- lazyeval::interp(~year == x, x = yr) df.subset <- dplyr::filter_(pop.df, .dots = condition) # Fix 2 - Use logical indexing df.subset <- df[df[["year"]] == yr, , drop = FALSE]
  • 21. TAKEAWAYS NSE can cause problems with scope Even with SE, mixed scope can cause problems Add assert-like statements, especially with NSE For non-interactive code, package writing, use logical indexing instead