- Rcpp is a package that facilitates interoperability between R and C++ by providing data structures and functions that make it easy to write C++ code that integrates with R. It has been released 54 times since 2008 with over 170 CRAN packages depending on it.
- Rcpp allows users to source C++ code from R using sourceCpp() and export C++ functions to R using attributes like // [[Rcpp::export]]. This improves performance over pure R code by leveraging fast C++ implementations.
- dplyr is a popular R package for data manipulation that achieves great performance through its use of Rcpp. Functions like arrange(), filter(), and summarise() are much faster when
30. dplyr
• Package by Hadley Wickham
• Plyr specialised for data frames: faster & with remote data stores
• Great design and syntax
• Great performance thanks to C++
31. arrange
ex: Arrange by year within each player
arrange(Batting, playerID, yearID)
Unit: milliseconds
expr    min        lq         median    uq         max        neval
df      186.64016  188.48495  190.8989  192.42140  195.36592  10
dt      349.25496  352.12806  357.4358  403.45465  405.30055  10
cpp     12.20485   13.85538   14.0081   16.72979   23.95173   10
base    181.68259  182.58014  184.6904  186.33794  189.70377  10
dt_raw  166.94213  170.15704  170.6418  220.89911  223.42155  10
32. filter
Find the year for which each player played the most games
filter(Batting, G == max(G))
Unit: milliseconds
expr  min        lq         median     uq         max       neval
df    371.96066  375.98652  380.92300  389.78870  430.2898  10
dt    47.37897   49.39681   51.23722   52.79181   95.8757   10
cpp   34.63382   35.27462   36.48151   38.30672   106.2422  10
base  141.81983  144.87670  147.36940  148.67299  173.8763  10
33. summarise
Compute the average number of at bats for each player
summarise(x, ab = mean(AB))
Unit: microseconds
expr    min         lq          median      uq          max         neval
df      470726.569  475168.481  495500.076  498223.152  502601.494  10
dt      23002.422   23923.691   25888.191   28517.318   28683.864   10
cpp     756.265     820.921     838.529     864.624     950.079     10
base    253189.624  259167.496  263124.650  273097.845  326663.243  10
dt_raw  22462.560   23469.528   24438.422   25718.549   28385.158   10
34. Vector Visitor
Traversing an R vector of any type with the same interface
class VectorVisitor {
public:
    virtual ~VectorVisitor(){}

    /** hash the element of the visited vector at index i */
    virtual size_t hash(int i) const = 0 ;

    /** are the elements at indices i and j equal */
    virtual bool equal(int i, int j) const = 0 ;

    /** creates a new vector, of the same type as the visited vector, by
     *  copying elements at the given indices
     */
    virtual SEXP subset( const Rcpp::IntegerVector& index ) const = 0 ;
} ;
35. Vector Visitor
inline VectorVisitor* visitor( SEXP vec ){
    switch( TYPEOF(vec) ){
    case INTSXP:
        if( Rf_inherits(vec, "factor" ))
            return new FactorVisitor( vec ) ;
        return new VectorVisitorImpl<INTSXP>( vec ) ;
    case REALSXP:
        if( Rf_inherits( vec, "Date" ) )
            return new DateVisitor( vec ) ;
        if( Rf_inherits( vec, "POSIXct" ) )
            return new POSIXctVisitor( vec ) ;
        return new VectorVisitorImpl<REALSXP>( vec ) ;
    case LGLSXP: return new VectorVisitorImpl<LGLSXP>( vec ) ;
    case STRSXP: return new VectorVisitorImpl<STRSXP>( vec ) ;
    default: break ;
    }
    // should not happen
    return 0 ;
}
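The visitor idea on the two slides above can be tried without R. The following is a minimal sketch in standard C++ only: Visitor, VisitorImpl and count_distinct are hypothetical names chosen for illustration, std::vector stands in for an R vector, and the subset() member is left out. It is not dplyr's code; it only shows how a caller can work through hash() and equal() without knowing the element type.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the slide's VectorVisitor (no Rcpp, no SEXP).
class Visitor {
public:
    virtual ~Visitor() {}
    // hash the element at index i
    virtual std::size_t hash(int i) const = 0;
    // are the elements at indices i and j equal
    virtual bool equal(int i, int j) const = 0;
};

// One implementation per element type, generated from a template.
template <typename T>
class VisitorImpl : public Visitor {
public:
    explicit VisitorImpl(const std::vector<T>& data) : data_(data) {}
    std::size_t hash(int i) const override { return std::hash<T>()(data_[i]); }
    bool equal(int i, int j) const override { return data_[i] == data_[j]; }
private:
    const std::vector<T>& data_;  // the visited vector must outlive the visitor
};

// Counts distinct elements among the first n, using only the type-erased
// interface: the set stores indices and delegates hashing and comparison.
int count_distinct(const Visitor& v, int n) {
    auto hasher = [&v](int i) { return v.hash(i); };
    auto same = [&v](int i, int j) { return v.equal(i, j); };
    std::unordered_set<int, decltype(hasher), decltype(same)> seen(16, hasher, same);
    for (int i = 0; i < n; ++i) seen.insert(i);
    return static_cast<int>(seen.size());
}
```

The same count_distinct works for a VisitorImpl<int> and a VisitorImpl<std::string>, which is the point of dispatching on the vector type once and then programming against the interface.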
36. Chunked evaluation
ir <- group_by( iris, Species)
summarise(ir,
    Sepal.Length = mean(Sepal.Length)
)
• R expression to evaluate: mean(Sepal.Length)
• Sepal.Length ∊ iris
• dplyr knows mean.
• fast and memory efficient algorithm
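One plausible reading of "fast and memory efficient" is that the mean of each group is computed straight from the original column through the group's row indices, so no per-group copy is made. The sketch below illustrates that reading in standard C++; mean_at and grouped_mean are hypothetical names, and the slides do not confirm that dplyr works exactly this way.

```cpp
#include <vector>

// Hypothetical helper: mean of `values` restricted to the rows in `index`.
// Reads the column in place; an empty index yields NaN (0.0 / 0).
double mean_at(const std::vector<double>& values, const std::vector<int>& index) {
    double sum = 0.0;
    for (int i : index) sum += values[i];
    return sum / index.size();
}

// One mean per group; each group is described only by its row indices.
std::vector<double> grouped_mean(const std::vector<double>& values,
                                 const std::vector<std::vector<int>>& groups) {
    std::vector<double> out;
    out.reserve(groups.size());
    for (const auto& g : groups) out.push_back(mean_at(values, g));
    return out;
}
```

The only allocation is the result vector, one double per group.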
37. Hybrid evaluation
myfun <- function(x) x+x
ir <- group_by( iris, Species)
summarise(ir,
    xxx = mean(Sepal.Length) + min(Sepal.Width) - myfun(Sepal.Length)
)
#1: fast evaluation of mean(Sepal.Length).
    5.006 + min(Sepal.Width) - myfun(Sepal.Length)
#2: fast evaluation of min(Sepal.Width).
    5.006 + 3.428 - myfun(Sepal.Length)
#3: fast evaluation of 5.006 + 3.428.
    8.434 - myfun(Sepal.Length)
#4: R evaluation of 8.434 - myfun(Sepal.Length).
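The four steps above amount to partially evaluating an expression tree: calls the engine recognises are replaced by constants, constant sub-expressions are folded, and whatever remains is handed to the general evaluator. Here is a rough sketch of that idea in standard C++. Expr, constant, call, binary and reduce are hypothetical names, every call is assumed to apply to the same single non-empty column, and only + and - are handled; it is an illustration, not dplyr's implementation.

```cpp
#include <algorithm>
#include <memory>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical expression tree; none of these names come from dplyr.
struct Expr {
    enum Kind { Const, Call, Binary };
    Kind kind = Const;
    double value = 0.0;              // used when kind == Const
    std::string name;                // used when kind == Call
    char op = '+';                   // used when kind == Binary: '+' or '-'
    std::shared_ptr<Expr> lhs, rhs;  // used when kind == Binary
};
typedef std::shared_ptr<Expr> ExprPtr;

ExprPtr constant(double v) {
    ExprPtr e = std::make_shared<Expr>();
    e->kind = Expr::Const; e->value = v;
    return e;
}
ExprPtr call(const std::string& name) {
    ExprPtr e = std::make_shared<Expr>();
    e->kind = Expr::Call; e->name = name;
    return e;
}
ExprPtr binary(char op, ExprPtr l, ExprPtr r) {
    ExprPtr e = std::make_shared<Expr>();
    e->kind = Expr::Binary; e->op = op; e->lhs = l; e->rhs = r;
    return e;
}

// Replaces known calls by constants and folds constant arithmetic.
// Unknown calls are kept as-is for a fallback evaluator (R, in the slides).
// Assumes x is non-empty.
ExprPtr reduce(const ExprPtr& e, const std::vector<double>& x) {
    switch (e->kind) {
    case Expr::Const:
        return e;
    case Expr::Call:
        if (e->name == "mean")
            return constant(std::accumulate(x.begin(), x.end(), 0.0) / x.size());
        if (e->name == "min")
            return constant(*std::min_element(x.begin(), x.end()));
        return e;
    case Expr::Binary: {
        ExprPtr l = reduce(e->lhs, x);
        ExprPtr r = reduce(e->rhs, x);
        if (l->kind == Expr::Const && r->kind == Expr::Const)
            return constant(e->op == '+' ? l->value + r->value
                                         : l->value - r->value);
        return binary(e->op, l, r);
    }
    }
    return e;
}
```

Reducing (mean + min) - myfun leaves a single subtraction whose left side is already a number and whose right side is the untouched myfun call, mirroring step #4.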
38. Hybrid Evaluation
• mean, min, max, sum, sd, var, n, +, -, /, *, <, >, <=, >=, &&, ||
• packages can register their own hybrid evaluation handler.
• See hybrid-evaluation vignette
41. C++11 : lambda
Lambda: function defined where used. Similar to apply functions in R.
#include <Rcpp.h>
using namespace Rcpp ;

// [[Rcpp::export]]
NumericVector foo( NumericVector v){
    // apply the lambda to every element of v
    NumericVector res = sapply( v,
        [](double x){ return x*x; }
    ) ;
    return res ;
}
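The same pattern can be tried without R. This is a small sketch in standard C++ in which std::transform plays the role of sapply and std::vector<double> stands in for NumericVector; square_all is a hypothetical name.

```cpp
#include <algorithm>
#include <vector>

// Squares every element; the lambda is defined at the point of use.
std::vector<double> square_all(const std::vector<double>& v) {
    std::vector<double> res(v.size());
    std::transform(v.begin(), v.end(), res.begin(),
                   [](double x) { return x * x; });
    return res;
}
```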
42. C++11 : for each loop
C++98, C++03
std::vector<double> v ;
for( std::size_t i=0; i<v.size(); i++){
    double d = v[i] ;
    // do something with d
}
C++11
for( double d: v){
    // do stuff with d
}
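To make the equivalence concrete, here is a sketch with both loop styles computing the same sum. The function names sum_indexed and sum_range_for are made up for this example.

```cpp
#include <cstddef>
#include <vector>

// C++98/03 style: explicit index.
double sum_indexed(const std::vector<double>& v) {
    double total = 0.0;
    for (std::size_t i = 0; i < v.size(); i++) {
        double d = v[i];
        total += d;
    }
    return total;
}

// C++11 style: range-based for, no index to get wrong.
double sum_range_for(const std::vector<double>& v) {
    double total = 0.0;
    for (double d : v) {
        total += d;
    }
    return total;
}
```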
43. C++11 : init list
C++98, C++03
NumericVector x = NumericVector::create( 1, 2 ) ;
C++11
NumericVector x = {1, 2} ;
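The standard library offers the same convenience. In this sketch std::vector<double> stands in for NumericVector, and make_cpp03 and make_cpp11 are hypothetical names; both functions build the same two-element vector.

```cpp
#include <vector>

// C++98/03 style: fill the vector element by element.
std::vector<double> make_cpp03() {
    std::vector<double> x;
    x.push_back(1);
    x.push_back(2);
    return x;
}

// C++11 style: brace initialization builds the same vector in one expression.
std::vector<double> make_cpp11() {
    return {1, 2};
}
```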
44. Other changes
• Move semantics: used under the hood in Rcpp11. Using less memory.
• Less code bloat. Variadic templates
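The memory point about move semantics can be seen with a plain std::vector. The sketch below says nothing about Rcpp11 internals; MoveReport and demo_move are hypothetical names. It checks that move construction hands the existing buffer to the new vector instead of copying it, which is what mainstream standard library implementations do with the default allocator.

```cpp
#include <utility>
#include <vector>

// Hypothetical result type for the demonstration.
struct MoveReport {
    bool same_buffer;   // target reuses the allocation made by source
    bool source_empty;  // source was left empty by the move
};

MoveReport demo_move() {
    std::vector<double> source(1000, 1.0);
    const double* before = source.data();
    // Move construction: no second allocation, no element-wise copy.
    std::vector<double> target(std::move(source));
    return MoveReport{ target.data() == before, source.empty() };
}
```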
45. Rcpp11 article
• I’m writing an article about C++11
• Explain the merits of C++11
• What’s next: C++14, C++17
• Goal is to make C++11 welcome on CRAN
• https://github.com/romainfrancois/cpp11_article