R is used in a vast range of ways, from purely ad hoc analyses by hobbyists to organized, structured development in the enterprise. Each way of using R brings different reproducibility challenges. Going through a range of typical workflows, we will show that understanding reproducibility must start with understanding your workflow. While presenting the workflows, we will show how we deal with reproducibility challenges using R Suite (http://rsuite.io), an open-source solution we developed to support our large-scale R development.
Managing large (and small) R based solutions with R Suite, by Wit Jakuczun
The presentation I gave at DataMass Gdańsk Summit in 2017:
R is a great tool for data scientists. Being very dynamic and popular, it is now one of the most important technologies on the market. Unfortunately, out-of-the-box R is not suited for large-scale applications. I will present R Suite, an open-source solution developed by us, for us, to manage the R development process.
Case Studies in advanced analytics with R, by Wit Jakuczun
A talk I gave at SQLDay 2017:
About 1.5 years ago Microsoft finalised its acquisition of Revolution Analytics, a provider of software and services for R. In my opinion this was one of the most important events for the R community. Now it is crucial to present its capabilities to the SQL Server community; it will be beneficial for both parties. I will present three case studies: cash optimisation at Deutsche Bank, a midterm model for energy price forecasting, and workforce demand optimisation. The case studies were implemented with our analytical workflow R Suite, which will also be presented briefly.
Presentation of the paper "Primers or Reminders? The Effects of Existing Review Comments on Code Review" published at ICSE 2020.
Authors:
Davide Spadini, Gül Calikli, Alberto Bacchelli
Link to the paper: https://research.tudelft.nl/en/publications/primers-or-reminders-the-effects-of-existing-review-comments-on-c
Processing malaria HTS results using KNIME: a tutorial, by Greg Landrum
Walks through a couple of KNIME workflows for working with HTS data.
The workflows are derived from the work described in this publication: https://f1000research.com/articles/6-1136/v2
The talk I gave at the Stream Reasoning workshop at TU Berlin on December 8. I give an overview of RSEP-QL and how it can capture and formalise the behaviour of existing RSP engines, e.g. C-SPARQL, EP-SPARQL, CQELS, and SPARQLstream.
Stay up-to-date on the latest news, events and resources for the OpenACC community. This month’s highlights cover a complete schedule of upcoming events, using OpenACC for a biophysics problem, HPC Summit Digital, an overview of the SDSC GPU Hackathon, the OmpSs-2 programming model, new resources and more!
Learn about the accomplishments and activities of the OpenACC organization over the course of 2019. This OpenACC Highlights covers the newest additions to the OpenACC leadership, the updated specification, conference participation, GPU Hackathons and more.
Stay up-to-date with the OpenACC Monthly Highlights. June's edition covers the OpenACC Summit 2021, NVIDIA GTC'21 on-demand sessions, upcoming GPU Hackathons and Bootcamps, Intersect360 Research HPC market forecast, recent research, new resources and more!
Stay up-to-date on the latest news, events and resources for the OpenACC community. This month’s highlights cover a Mentor Spotlight on Matthew Norman from ORNL, the first GPU Hackathon of the 2021 season, GTC21, Clacc, upcoming GPU Hackathons and Bootcamps, and new resources!
From weather and climate to seismic imaging to aeronautics, OpenACC sessions featured at GTC20 are helping to facilitate discussions, educate attendees and encourage networking and collaboration.
Sessions cover a broad range of topics: the “Meet the Experts” session enabled one-on-one deep dives into using OpenACC to solve specific challenges, posters highlight how OpenACC is being applied to current science applications, and the on-demand tutorial delivers hands-on skills building.
Stay up-to-date on the latest news, events and resources for the OpenACC community. This month’s highlights cover the upcoming NVIDIA GTC 2019, a complete schedule of GPU hackathons and more!
Stay up-to-date on the latest news, events and resources for the OpenACC community. This month’s highlights cover the most recent 2019 GPU Hackathons, a complete schedule of upcoming events, new resources and more!
Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ..., by Revolution Analytics
[Presentation by Skylar Lyon at DataWeek 2014, September 17 2014.]
I recently faced the task of how to scale out an existing analytics process. The schedule was compressed - it always is in my world. The data was big - 400+ million rows waiting in database. What did I do? I offered my favorite type of solution - quick and dirty.
At the outset, I wasn't sure how easy it would be. Nor was I certain of realized performance gains. But the concept seemed sound and the exercise fun. Let's move the compute to the data via Revolution R Enterprise for Teradata.
This presentation outlines my approach in leveraging a colleague's R models as I experimented with running R in-database. Would my path lead to significant improvement? Could it be used to productionalize the workflow?
Raster Algebra with Oracle Spatial and uDig, by Karin Patenge
The slide deck describes the integration of Oracle Spatial with open-source technologies. Using uDig as an example, it shows step by step how it can be used together with Oracle Spatial for raster data analysis. As an example, a vegetation index (NDVI) is computed.
If you are interested, you can read more on the Oracle Spatial Blog (http://oracle-spatial.blogspot.com).
Beacon v2 Reference Implementation: An Overview, by CINECA Project
The Beacon v2 Reference Implementation (B2RI) is free, open-source, Linux-based software created by the Centre for Genomic Regulation (Barcelona, Spain) that allows lighting up a Beacon v2 out of the box. In this training session, a B2RI developer gives an overview of how to use the software to “beaconize” your data (from the user’s perspective).
At the end of this training session, participants will be familiar with the input and output requirements of the B2RI, as well as with the types of queries allowed.
This training session was delivered on 17 February 2022 as part of the CINECA GA4GH Beacon series.
You can learn about the CINECA project on https://www.cineca-project.eu/
Context: Recent projects such as L4.verified (the verification of the seL4 microkernel) have demonstrated that large-scale formal program verification is now becoming practical.
Objective: We address an important but unstudied aspect of proof engineering: proof productivity.
Method: We extracted size and effort data from the history of the development of nine projects associated with L4.verified.
Results: We find strong linear relationships between effort and proof size for projects and for individuals. We discuss opportunities and limitations with the use of lines of proof as a size measure, and discuss the importance of understanding proof productivity for future research.
Conclusions: An understanding of proof productivity will assist in its further industrial application and provide a basis for cost estimation and understanding of rework and tool usage.
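The linear effort-size relationship reported in the Results can be fitted with ordinary least squares. A minimal sketch follows; the proof sizes and effort figures below are invented for illustration and are not the projects' actual data:

```python
# Sketch: fitting the effort-vs-proof-size relationship described above.
# The numbers are hypothetical; they merely stand in for per-project data.
import numpy as np

proof_size = np.array([1000, 2500, 4000, 8000, 12000])  # lines of proof (hypothetical)
effort = np.array([2.1, 5.0, 8.2, 15.9, 24.5])          # person-months (hypothetical)

slope, intercept = np.polyfit(proof_size, effort, 1)    # least-squares line
r = np.corrcoef(proof_size, effort)[0, 1]               # strength of the linear fit

print(f"person-months per 1000 lines of proof: {slope * 1000:.2f}")
print(f"correlation: {r:.3f}")
```

A fit like this is exactly what enables the cost estimation mentioned in the Conclusions: given an expected proof size, the slope converts it into an effort forecast.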
How to lock a Python in a cage? Managing a Python environment inside an R project, by WLOG Solutions
Presentation from a workshop delivered by Piotr Chaberski during PyData Warsaw Meetup on Feb. 06, 2018.
Imagine that you are developing a project using R, and your big corporate customer, after weeks of processing requests to establish an open-source analytical environment, finally managed to install R on their production machines. Now you realize that it would be nice to use some Python library in your solution...
How would you tell the client to switch to Python for a while?
Managing large scale projects in R with R Suite, by WLOG Solutions
Presentation from a workshop delivered by Piotr Chaberski during PyData Warsaw on Oct. 18, 2017.
Description
Machine learning is not only about algorithms. Machine learning is about value, and this can be achieved only after proper deployment of machine learning solutions. I will present best practices for managing R-based ML projects.
Abstract
Agenda:
I will use our open-source tool R Suite (http://rsuite.io/). During the workshop I will talk about:
project structure
development cycle
repository management
deployment
During the workshop you will learn about the best practices (e.g. loggers, version control, etc.) we have developed over 12 years of using R.
Requirements:
Basic R knowledge.
Basic ML/DS knowledge.
Software installed:
R in version 4.3.2
R Suite (latest)
R Studio Desktop
Operating system:
I will be using Windows 10 and this is recommended, but Linux should also work.
WebRTC Live Q&A Session #5 - JavaScript Promises and WebRTC Interoperability ..., by Amir Zmora
WebRTC training about JavaScript promises and an update on WebRTC interoperability, API compatibility and IMTC tests. Part of the monthly WebRTC live Q&A sessions by Alex Gouaillard, Dan Burnett and Amir Zmora
(Slides on JavaScript promises carry a specific Copyright as detailed on slides themselves)
A guide to making crashproof libraries
A tips-and-tricks presentation for the Poznań Android Developer Group.
http://www.meetup.com/Poznan-Android-Developer-Group/events/228107133/
Presentation from the meetup "[JOI] TOTVS Developers Joinville - Java #1", held on 07/08/2019.
** Java news, GraalVM and Quarkus
** From zero to the cloud with Java and Kubernetes
Graal and Truffle: Modularity and Separation of Concerns as Cornerstones for ..., by Thomas Wuerthinger
Multi-language runtimes providing simultaneously high performance for several programming languages still remain an illusion. Industrial-strength managed language runtimes are built with a focus on one language (e.g., Java or C#). Other languages may compile to the bytecode formats of those managed language runtimes. However, the performance characteristics of the bytecode generation approach are often lagging behind compared to language runtimes specialized for a specific language. The performance of JavaScript is for example still orders of magnitude better on specialized runtimes (e.g., V8 or SpiderMonkey).
We present a solution to this problem by providing guest languages with a new way of interfacing with the host runtime. The semantics of the guest language is communicated to the host runtime not via generating bytecodes, but via an interpreter written in the host language. This gives guest languages a simple way to express the semantics of their operations including language-specific mechanisms for collecting profiling feedback. The efficient machine code is derived from the interpreter via automatic partial evaluation. The main components reused from the underlying runtime are the compiler and the garbage collector. They are both agnostic to the executed guest languages.
The host compiler derives the optimized machine code for hot parts of the guest language application via partial evaluation of the guest language interpreter. The interpreter definition can guide the host compiler to generate deoptimization points, i.e., exits from the compiled code. This allows guest language operations to use speculations: An operation could for example speculate that the type of an incoming parameter is constant. Furthermore, the guest language interpreter can use global assumptions about the system state that are registered with the compiled code. Finally, part of the interpreter's code can be excluded from the partial evaluation and remain shared across the system. This is useful for avoiding code explosion and appropriate for infrequently executed paths of an operation. These basic mechanisms are provided by the underlying language-agnostic host runtime and allow separation of concerns between guest and host runtime.
We implemented Truffle, the guest language runtime framework, on top of the Graal compiler and the HotSpot virtual machine. So far, there are prototypes for C, J, Python, JavaScript, R, Ruby, and Smalltalk running on top of the Truffle framework. The prototypes are still incomplete with respect to language semantics. However, most of them can run non-trivial benchmarks to demonstrate the core promise of the Truffle system: Multiple languages within one runtime system at competitive performance.
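The speculation mechanism described above (specialize an operation on observed types, then deoptimize when the assumption breaks) can be illustrated with a toy simulation. This is plain Python, not the Truffle API; the `AddNode` class and its state names are invented for illustration:

```python
# Toy simulation of speculation with deoptimization, in the spirit of the
# mechanism described above. Not the Truffle API; all names are invented.

class AddNode:
    """An 'add' operation that starts uninitialized, specializes on the
    first observed operand types, and rewrites itself to a generic version
    when the speculation fails (a 'deoptimization')."""

    def __init__(self):
        self.state = "uninitialized"
        self.deopt_count = 0

    def execute(self, a, b):
        if self.state == "int":
            # Fast path compiled under the speculation "both operands are ints".
            if isinstance(a, int) and isinstance(b, int):
                return a + b
            # Speculation failed: exit the compiled code and rewrite the node.
            self.deopt_count += 1
            self.state = "generic"
        elif self.state == "uninitialized":
            if isinstance(a, int) and isinstance(b, int):
                self.state = "int"  # profiling feedback: specialize on ints
                return a + b
            self.state = "generic"
        # Generic path handles any operands supporting '+'.
        return a + b

node = AddNode()
print(node.execute(1, 2))      # first call: specializes to "int"
print(node.execute(3, 4))      # stays on the int fast path
print(node.execute("a", "b"))  # deoptimizes, then handles generically
print(node.state, node.deopt_count)
```

In the real system the "fast path" is machine code produced by partial evaluation, and the rewrite corresponds to a deoptimization point generated by the host compiler; the simulation only mirrors the state transitions.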
Similar to "Know your R usage workflow to handle reproducibility challenges":
Presentation delivered during the Data Science Rzeszów meetup:
I will present reasons why optimization is superior to predictive algorithms in practical data science applications. I will cover exemplary case studies, tools, and hints from my experience delivering hybrid solutions that exploit both prediction and optimization.
Always Be Deploying. How to make R great for machine learning in (not only) E..., by Wit Jakuczun
The presentation I delivered at WhyR 2019.
Abstract:
For many years software engineers have put enormous effort into developing best practices to deliver stable and maintainable software. How can R users benefit from this experience? I will try to answer this question by going through several concepts and tools that are natural for software engineers but are often undervalued by R users.
I will start with a description of the deployment process, because this is the ultimate step that exposes all weaknesses. You will learn about structuring an R project, using abstractions to manage a model’s features, automating the model-building process, optimizing the performance of the solution, and the challenges of the deployment process itself.
Driving your marketing automation with multi-armed bandits in real time, by Wit Jakuczun
Presentation delivered at Big Data Tech Warsaw 2019 by me and Maciej Próchniak from TouK.
Multi-armed bandits vs simple A/B testing. Architecture of the solution: how to connect Flink, Nussknacker and R? Other use cases: what are other good fits for a similar architecture?
We observe that many of our customers are actively adopting various marketing automation solutions. While most of them offer some basic A/B testing modules, these are often too simple for highly dynamic conditions. Better outcomes can be achieved using e.g. multi-armed bandit algorithms; however, it is not so straightforward to deploy them in a real-time production environment.
In our presentation, we will use a platform based on Apache Flink, Nussknacker (our custom GUI) and RStudio + R Suite, everything deployed on Kubernetes. The main goal of our talk is to show how, using the proposed tools, we can create a complete flow, from model creation through deployment and reinforcement learning, that helps to automate marketing communication without the need for custom code development.
The talk is partially based on our former deployments of similar solutions; many ideas are new, however.
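The contrast drawn above between a static A/B split and a bandit algorithm can be sketched with Beta-Bernoulli Thompson sampling. The conversion rates, arm count, and round count below are invented for illustration; this is not the Flink/Nussknacker deployment itself:

```python
# Minimal Thompson-sampling sketch for choosing between message variants,
# as contrasted with static A/B testing above. Conversion rates are invented.
import random

random.seed(0)
true_rates = [0.05, 0.11, 0.08]  # hypothetical per-variant conversion rates
wins = [1, 1, 1]                 # Beta prior: alpha
losses = [1, 1, 1]               # Beta prior: beta
pulls = [0, 0, 0]

for _ in range(5000):
    # Sample a plausible rate for each arm and play the best-looking one.
    samples = [random.betavariate(wins[i], losses[i]) for i in range(3)]
    arm = samples.index(max(samples))
    pulls[arm] += 1
    if random.random() < true_rates[arm]:
        wins[arm] += 1   # conversion observed
    else:
        losses[arm] += 1

print("pulls per arm:", pulls)  # traffic concentrates on the best arm over time
```

Unlike a fixed 33/33/33 split, the bandit shifts traffic toward the winning variant while still exploring, which is why it suits the highly dynamic conditions mentioned above.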
Large scale machine learning projects with R Suite, by Wit Jakuczun
Agenda for the workshop I conducted at the ML@Enterprise conference that took place on 14 December 2017 in Warsaw.
Machine learning is not only about algorithms. Machine learning is about value, and this can be achieved only after proper deployment of machine learning solutions. I will present best practices for managing R-based ML projects. I will use our open-source tool R Suite (http://rsuite.io/). During the workshop I will talk about:
– project structure
– development cycle
– deployment
– test
R as the main platform for advanced analytics in the enterprise (WhyR, 2017-09-28), by Wit Jakuczun
Presentation (in Polish) I gave at the WhyR conference in Warsaw. The abstract:
The world of hermetic analytical platforms is slowly becoming history. Today, advanced analytics is being pushed forward by the open-source world, supported by the biggest players. In various discussions, R's maturity is questioned when the enterprise point of view is considered. Based on an R deployment in a large telecom, I will explain why I claim R can be number one in advanced analytics in any large corporation. I will show the virtues and vices of migrating to R.
Presentation from Data Science Summit 2017:
R has turned the world of analytics upside down. Big players such as Microsoft and Oracle see this. But the question arises: how do you translate the modernity and changeability of R into value in the stable enterprise world? How much time and money does it cost? And how do you do it safely? I will answer these questions based on an R deployment in a large telecom.
Levelwise PageRank with Loop-Based Dead End Handling Strategy: SHORT REPORT ..., by Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large submission of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
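A minimal sketch of the levelwise idea from the abstract: rank one SCC level at a time in topological order, iterating only within the current level. The tiny graph, its hand-supplied levels, and all function names below are ours, not the report's code; a real implementation would compute the SCC condensation itself and handle dead ends as the report discusses.

```python
# Sketch of Levelwise vs Monolithic PageRank on a tiny dead-end-free graph.
# Levels (SCCs in topological order) are supplied by hand for illustration.

def pagerank_monolithic(edges, n, d=0.85, iters=200):
    """Standard power iteration over the whole graph (no dead ends)."""
    out = [0] * n
    for u, _ in edges:
        out[u] += 1
    r = [1.0 / n] * n
    for _ in range(iters):
        nr = [(1 - d) / n] * n
        for u, v in edges:
            nr[v] += d * r[u] / out[u]
        r = nr
    return r

def pagerank_levelwise(edges, levels, n, d=0.85, tol=1e-12):
    """Process SCC levels in topological order; earlier levels are final,
    so no cross-level iteration (or communication) is needed."""
    out = [0] * n
    incoming = [[] for _ in range(n)]
    for u, v in edges:
        out[u] += 1
        incoming[v].append(u)
    r = [1.0 / n] * n
    for level in levels:
        while True:  # iterate only within the current level until convergence
            diff = 0.0
            for v in level:
                new = (1 - d) / n + d * sum(r[u] / out[u] for u in incoming[v])
                diff += abs(new - r[v])
                r[v] = new
            if diff < tol:
                break
    return r

# Two 2-cycles: SCC {0,1} feeds SCC {2,3}; no dead ends, condensation is a chain.
edges = [(0, 1), (1, 0), (1, 2), (2, 3), (3, 2)]
levels = [[0, 1], [2, 3]]
mono = pagerank_monolithic(edges, 4)
lvl = pagerank_levelwise(edges, levels, 4)
print("monolithic:", [round(x, 4) for x in mono])
print("levelwise: ", [round(x, 4) for x in lvl])
```

Both variants converge to the same ranks here, which is the point: on a dead-end-free graph the topological-order schedule changes only *how* the fixed point is reached, not the fixed point itself.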
Opendatabay - Open Data Marketplace.pptx, by Opendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated, synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2..., by pchutichetpong
M Capital Group (“MCG”) expects to see demand and the changing evolution of supply, facilitated through institutional investment rotating out of offices and into work from home (“WFH”), while the need for data storage keeps expanding as global internet usage grows, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23..., by John Andrews
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Adjusting primitives for graph: SHORT REPORT / NOTES, by Subhajit Sahu
Notes on graph algorithms like PageRank. Compressed Sparse Row (CSR) is an adjacency-list-based graph representation.
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
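As a rough analogue of the "different modes" comparisons listed above (which require OpenMP/CUDA toolchains to reproduce), the same sequential-vs-bulk contrast can be shown in plain Python with NumPy. Sizes and the measured ratio are illustrative only:

```python
# Tiny analogue of the 'sum with different modes' comparison above:
# a sequential element-by-element loop vs a single bulk reduction
# (standing in for the OpenMP/CUDA variants). Sizes are arbitrary.
import time
import numpy as np

x = np.random.rand(1_000_000).astype(np.float32)

t0 = time.perf_counter()
s_loop = 0.0
for v in x[:100_000]:        # smaller slice: the pure-Python loop is slow
    s_loop += float(v)
t_loop = time.perf_counter() - t0

t0 = time.perf_counter()
s_vec = float(x.sum())       # one fused reduction over all elements
t_vec = time.perf_counter() - t0

print(f"loop  (100k elems): {t_loop * 1e3:.1f} ms")
print(f"bulk  (1M elems):   {t_vec * 1e3:.1f} ms")
```

The float32-vs-bfloat16 storage comparison in the notes has the same shape: the arithmetic stays the same while the storage type (and hence bandwidth) changes, which this sketch does not attempt to reproduce.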
5. Copyright (c) WLOG Solutions
John
Could not deliver R labs homework due to package incompatibility on the professor's laptop.
6.
Kate and Henry
Missed deadlines due to problems installing packages for their R Shiny app on the customer's server running RedHat Enterprise 6.8.
7.
The Team
Had serious issues with package version conflicts due to many users and many projects running on a RedHat Enterprise machine without internet access.
8.
Three different stories, the same reproducibility problem.
10.
Reproducibility is the ability to run your code repeatedly, at a different time, on a different computer, in such a way as to obtain the same outputs given the same inputs.
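The definition on this slide can be made operational: run the pipeline twice on identical inputs and compare output fingerprints. A minimal sketch in Python, where `analysis` is a hypothetical stand-in for a real pipeline; the hard part in practice is keeping the surrounding stack (packages, OS) identical too, which is what the later slides address:

```python
# Operational check of the definition above: same inputs must yield the
# same outputs across runs. 'analysis' is a stand-in for a real pipeline.
import hashlib
import json
import random

def analysis(inputs, seed=42):
    rng = random.Random(seed)  # pin all randomness so reruns are comparable
    sample = [rng.gauss(inputs["mean"], inputs["sd"]) for _ in range(100)]
    return {"mean_est": sum(sample) / len(sample)}

def fingerprint(outputs):
    # Canonical JSON + SHA-256 gives a stable, comparable output fingerprint.
    blob = json.dumps(outputs, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

inputs = {"mean": 0.0, "sd": 1.0}
run1 = fingerprint(analysis(inputs))
run2 = fingerprint(analysis(inputs))
print("reproducible:", run1 == run2)
```

Comparing fingerprints from runs on different machines, or at different times, turns the definition into a concrete test rather than a hope.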
15.
Bare metal
Operating system
Solution dependencies
Code
Data
21.
When is reproducibility important while you program in R?
22.
Development (Debian/Ubuntu, RedHat/CentOS, Windows) → Production (Debian/Ubuntu, RedHat/CentOS, Windows)
Deploy (share) the solution to production
23.
Development (Debian/Ubuntu, RedHat/CentOS, Windows) → Development' (Debian/Ubuntu, RedHat/CentOS, Windows)
Restore the development environment
24.
Three workflows, three reproducibility solutions.
25.
John, student/hobbyist
Dev/Production
Version control
Family & Friends or Professor
MRAN
26.
Kate and Henry, consultancy team/freelancer/scientist
Dev → Production
Continuous integration
Version control
Local CRAN
MRAN
On-premise / Cloud / Spark / etc.
27.
The Team, corporate/in-house team
Dev → Production
Continuous integration
Version control
Local CRAN
28.
One word on Docker
Development → Production
Build for different OS
Deployment package (.zip)
29.
Second word on Docker
Development → Production
Build Docker image
30.
CRAN management
Multiple R versions
Debian/Ubuntu, Windows, RedHat/CentOS
Docker
Jenkins
Isolated projects
No installation on prod
Internetless environments
System requirements
Git/SVN
Binary packages
http://rsuite.io
https://github.com/WLOGSolutions/RSuite
https://www.slideshare.net/WLOGSolutions