This document provides an overview of data visualization concepts and techniques using R. It begins with an agenda and introduces data visualization and its importance. Key concepts in data visualization like the grammar of graphics are explained. Visualization techniques using base graphics in R like bar plots and histograms are demonstrated. Visualization with ggplot2 is covered through examples using geoms like geom_bar(), geom_histogram(), geom_point(), and geom_boxplot(). Faceting and theme layers are also explained. The document ends with sample quiz questions.
Digital analytics with R - Sydney Users of R Forum - May 2015 - Johann de Boer
This document discusses using the ganalytics R package to access and analyze Google Analytics data through R. It provides an overview of Google Analytics and its APIs, demonstrates how to build queries with ganalytics, and shows how to extract and summarize the data in R. It also discusses enhancing ganalytics by improving documentation, testing, adding features, and internationalization. The document encourages participation in open source development of the package.
Patterns and Practices for Event Design With Adam Bellemare | Current 2022 - Hosted by Confluent
Patterns and Practices for Event Design With Adam Bellemare | Current 2022
Events are the fundamental component of every streaming architecture, and how you implement them will hugely impact your event-driven architectures. Despite the wide range of materials on event-driven architectures and the importance of event modeling, this critical domain is often left as an exercise for you to implement on your own. Improperly modeling your events can have difficult and costly impacts on not only your event consumers but on the teams and systems that produce them as well.
In this talk, Adam covers the main considerations of modeling and implementing events. Data is often modeled as a Fact or a Delta, though the distinction isn't always clear.
For one, facts are commonly used in the event-carried state transfer pattern, while deltas are commonly used in event sourcing. But when communicating across domain boundaries, which ones should you choose? What are the tradeoffs, the benefits, and the best use-cases for each? Adam digs into these main event types, providing some examples and guidelines for when to use each.
Adam closes out the presentation with an opinionated list of best practices. Do you think naming is tricky? What about versioning? Evolving your data model got you down? Torn between multiple event types per stream and multiple streams per event? Adam has a host of best practices, well-reasoned examples, and practical tips to help you model and implement your events and streams.
An Interactive Introduction To R (Programming Language For Statistics) - Dataspora
This is an interactive introduction to R.
R is an open source language for statistical computing, data analysis, and graphical visualization.
While most commonly used within academia, in fields such as computational biology and applied statistics, it is gaining currency in industry as well – both Facebook and Google use R within their firms.
This document presents insights from analyzing visitor behavior and activities on a website using R. It finds that there are three categories of visitors - new users, returning users, and neither. Returning users have lower bounce rates than new users. There are also natural clusters in user activities that correspond to the different visitor categories. The analysis finds temporal patterns by month but not strong weekday/weekend patterns due to the dummy nature of the data. It also examines trends based on user platform and site visited, finding that some sites are more popular across platforms than others.
Deep Anomaly Detection from Research to Production Leveraging Spark and Tens... - Databricks
This document discusses Swedbank's work on anomaly detection from research to production. Some key points:
- Swedbank's Analytics & AI team works on advanced analytics and AI research, delivering applications to address business needs in the short and long term.
- They describe their approach to building deep anomaly detection models using generative adversarial networks and deploying them using TensorFlow Serving.
- The team has developed an "Analytical Ops" framework to streamline the process from building models in a research environment to translating, packaging, and publishing them for production use.
- Lessons learned include the importance of joint data science and engineering efforts, using a feature store for reuse, and infrastructure to support hyperparameter tuning.
This document discusses storing product and order data as JSON in a database to support an agile development process. It describes creating tables with JSON columns to store this data, and using JSON functions like JSON_VALUE and JSON_TABLE to query and transform the JSON data. Examples are provided of indexing JSON columns for performance and updating product JSON to include unit costs by joining external data. The goal is to enable flexible and rapid evolution of the application through storing data in JSON.
Instructions for the Microsoft Excel Templates - This wor.docx - dirkrplav
This document provides instructions for completing Microsoft Excel templates for a grading assignment. It explains that only the provided workbook should be submitted and that it contains the assignment details. Students should enter their name and solve problems by filling in highlighted cells with account titles, values, formulas, and text explanations. The templates are formatted to print on standard paper sizes and use standard accounting formatting like commas. Students should leverage features like split panes and lookup formulas to assist with copying data accurately. Sample templates are provided for problems on income statements, retained earnings statements, and long-term construction contracts.
The document analyzes a superstore dataset containing sales data from 2015 to 2018 to understand shopping patterns and identify profitable products and regions. Visualizations show that the Western region had the most orders. Quantity ordered was highest for 1 item. Technology was the most profitable category at 36% of sales. The Tree Map showed phones had the highest sales while furniture had losses. Sales and profit were somewhat correlated, while profit and discount were negatively correlated. Word clouds indicated products related to Xerox, binders, chairs and Avery were ordered most frequently.
The document provides an introduction to machine learning, including:
- An overview of machine learning and its uses like fraud detection, personalization, and predictive maintenance.
- The basics of machine learning including the goal of finding the best model to predict outcomes, and common algorithms like linear regression, logistic regression, and k-means clustering.
- Demo examples of linear and logistic regression using scikit-learn in Python.
- Quotes about machine learning from Jeff Bezos and its potential.
The Sum of our Parts: the Complete CARTO Journey [CARTO] - CARTO
In this webinar, we put all of the pieces together - showing, using the example of the Real Estate market in Los Angeles, how the CARTO platform powers a data-to-decision workflow, showcasing every step along the way.
Watch it now at: https://go.carto.com/sum-parts-complete-carto-journey-webinar-recorded
Real-Time Personalized Customer Experiences at Bonobos (RET203) - AWS re:Inve... - Amazon Web Services
In this session, learn how Bonobos, an online retailer for men's clothing and accessories, powers their personalized customer experiences on top of AWS. We start by exploring the foundational elements required to build an effective retail data platform as well as the building blocks provided by AWS to deliver these experiences. Learn how Bonobos leverages Segment in their architecture, and hear from Bonobos and Segment on the objectives, challenges, and outcomes realized by Bonobos through their journey in constructing and deploying their personalization platform.
RStudio uses data from its products and customer interactions to improve user experience and product quality. Data is collected from tools in RStudio's data science toolchain including exploration, analysis, modeling, visualization and communication packages. This data is cleaned, transformed and visualized to understand user needs. Insights inform improvements to RStudio products, documentation and support processes. The goal is to enhance reproducibility, scalability and measurability of improvements through a data-driven approach.
Customer Clustering for Retailer Marketing - Jonathan Sedar
This was a 90 min talk given to the Dublin R user group in Nov 2013. It describes how one might go about a data analysis project using the R language and packages, using an open source dataset.
Implementing a data_science_project (Python Version)_part1 - Dr Sulaimon Afolabi
This document outlines the steps for loading and merging crime data from multiple sources for machine learning analysis. It begins by introducing the datasets: Dataset A contains crime reports and police stations; Dataset B contains police station populations; and Dataset C contains police station coordinates. It then describes loading each dataset and checking for duplicates. The datasets are merged by joining Dataset A and B on the police station feature, and joining the resulting dataset with Dataset C on police station. The final merged dataset will contain crime statistics, population estimates, and location data for analysis.
CARTO in 5 Steps: from Data to Decision-Making [CARTO] - CARTO
In this webinar we walk through - using a demonstration with the Los Angeles real estate market as an example - each of the five steps the CARTO platform follows for effective data-driven decision-making.
Watch it now at: https://go.carto.com/carto-pasos-dato-toma-decisiones-recorded
Google Analytics for Beginners - Training - Ruben Vezzoli
I used this presentation for an internal training about Google Analytics and Web Analytics.
Google Analytics Training for Beginners.
Google Analytics Tutorial
Google Analytics for Dummies
Google Analytics Guide
The document describes using predictive analytics to help increase sales by 30% for an online pet store. It performs principal component analysis on the store's 2012-2014 data to identify key variables accounting for 99% of variability. It builds a linear regression model predicting units sold based on these variables. A sensitivity analysis varies the variables to identify combinations that could achieve the 30% sales goal.
Most of today's software is highly static, even if it is written in a dynamic language like Smalltalk. Developers are not encouraged to extend the frameworks they are using, and end-users are unable to change the features of their software without initiating a new development effort. In contrast, extensible software is designed for change, and customizable software can be adapted to new needs without requiring in-depth knowledge of the underlying implementation domain.
In this presentation I investigate how to write truly dynamic software and distill common patterns of software customizability. As running examples I present tools that I worked on during my path of discovering Smalltalk. One of these examples is Magritte, a dynamic meta-model that gives end-users the possibility to customize their applications without the need for additional development effort. Another example is Helvetia, an infrastructure enabling on-the-fly customization of the programming language and development environment.
1. The document discusses the power of data visualization and different charting libraries for JavaScript including D3.js, Chart.js, Dygraphs, and Google Charts.
2. It provides information on each library's learning curve, lines of code needed for basic charts, level of customization, interactivity, and best uses.
3. The recap emphasizes defining the chart purpose, choosing a chart type based on data, designing the chart, selecting a library, and iterating the process to effectively communicate with data visualization.
GraphTour - How to Build Next-Generation Solutions using Graph Databases - Neo4j
The document provides an agenda and overview of a presentation about graph-based solutions using Neo4j for telecom services. The presentation covers topics like service assurance in telecoms using Neo4j to model dependencies, performing impact and root cause analysis, and using graphs for governance, metadata management and GDPR compliance through concepts like data modeling, transformation, and entitlement management. The conclusions emphasize that graphs are well-suited for these domains and that the combination of domain expertise with Neo4j's graph capabilities provides powerful solutions.
The document discusses various methods of visualizing data through charts and graphs. It covers different formats for storing and accessing data (such as CSV, JSON, XML), engines for generating visualizations (Google Chart Tools, Google Fusion Tables), and techniques for embedding visualizations (canvas tag, scripted charts, templated visualizations). It also addresses issues around separating data from visualizations and where the data is located in the visualization process.
Data Science. .Net/C# Monte Carlo modeling. The R Programming language. See it all come together in one place in this talk. Presentation date 6/13 at Lake County .NET User Group.
45min talk given at LondonR March 2014 Meetup.
The presentation describes how one might go about an insights-driven data science project using the R language and packages, using an open source dataset.
The document discusses data warehouses and multi-dimensional data analysis. It provides an example of modeling sales data across multiple tables in a relational database and shows how long queries take on large amounts of data. It then introduces dimensional modeling and how to structure data in a data warehouse using fact and dimension tables. Finally, it demonstrates how to model the sales data multidimensionally using a data cube and discusses OLAP technologies like Mondrian for multi-dimensional analysis.
Amazon Machine Learning is a service that makes it easy for developers of all skill levels to use machine learning technology. Amazon Machine Learning provides visualization tools and wizards that guide you through the process of creating machine learning (ML) models without having to learn complex ML algorithms and technology. Once your models are ready, Amazon Machine Learning makes it easy to get predictions for your application using simple APIs, without having to implement custom prediction generation code, or manage any infrastructure.
Visual, Interactive, Predictive Analytics for Big Data - Arimo, Inc.
Adatao Demo at the First Apache Spark Summit, Nikko Hotel, San Francisco, December 2, 2013
A real-time, live demo of the Adatao big-data analytics system for both Business Users and Data Scientists/Engineers. We showed terabyte data modeling in seconds on a 40-node cluster. And with a beautiful, user-friendly web app, as well as R/R-Studio & Python interfaces.
Viki collects clickstream data from clients and stores it in Hadoop and S3. They retrieve and process the data using PostgreSQL. The data requires extensive cleaning and transforming to standardize fields and properties before analysis. Viki builds query reports and dashboards to visualize the data and provide insights to stakeholders. They manage dependencies between jobs using Azkaban and present findings through daily reports and an interactive data explorer tool.
Dataiku productive application to production - pap is may 2015 Dataiku
This document discusses the development of predictive applications and outlines a vision for a platform called "Blue Box" that could help address many of the challenges in building and deploying these applications at scale. It notes that building predictive applications currently requires integrating multiple separate components. The document then describes desired features for the Blue Box platform, such as data cleansing, external data integration, model updating, decision logic, auditing, and serving predictions in real-time. It poses questions about how such a platform could be created, whether through open source or a commercial offering.
Ensuring Efficiency and Speed with Practical Solutions for Clinical Operations - OnePlan Solutions
Clinical operations professionals encounter unique challenges. Balancing regulatory requirements, tight timelines, and the need for cross-functional collaboration can create significant internal pressures. Our upcoming webinar will introduce key strategies and tools to streamline and enhance clinical development processes, helping you overcome these challenges.
Strengthening Web Development with CommandBox 6: Seamless Transition and Scal... - Ortus Solutions, Corp
Join us for a session exploring CommandBox 6’s smooth website transition and efficient deployment. CommandBox revolutionizes web development, simplifying tasks across Linux, Windows, and Mac platforms. Gain insights and practical tips to enhance your development workflow.
Come join us for an enlightening session where we delve into the smooth transition of current websites and the efficient deployment of new ones using CommandBox 6. CommandBox has revolutionized web development, consistently introducing user-friendly enhancements that catalyze progress in the field. During this presentation, we’ll explore CommandBox’s rich history and showcase its unmatched capabilities within the realm of ColdFusion, covering both major variations.
The journey of CommandBox has been one of continuous innovation, constantly pushing boundaries to simplify and optimize development processes. Regardless of whether you’re working on Linux, Windows, or Mac platforms, CommandBox empowers developers to streamline tasks with unparalleled ease.
In our session, we’ll illustrate the simple process of transitioning existing websites to CommandBox 6, highlighting its intuitive features and seamless integration. Moreover, we’ll unveil the potential for effortlessly deploying multiple websites, demonstrating CommandBox’s versatility and adaptability.
Join us on this journey through the evolution of web development, guided by the transformative power of CommandBox 6. Gain invaluable insights, practical tips, and firsthand experiences that will enhance your development workflow and embolden your projects.
These are the slides of the presentation given during the Q2 2024 Virtual VictoriaMetrics Meetup. View the recording here: https://www.youtube.com/watch?v=hzlMA_Ae9_4&t=206s
Topics covered:
1. What is VictoriaLogs
Open source database for logs
● Easy to set up and operate - just a single executable with sane default configs
● Works great with both structured and plaintext logs
● Uses up to 30x less RAM and up to 15x less disk space than Elasticsearch
● Provides simple yet powerful query language for logs - LogsQL
2. Improved querying HTTP API
3. Data ingestion via Syslog protocol
* Automatic parsing of Syslog fields
* Supported transports:
○ UDP
○ TCP
○ TCP+TLS
* Gzip and deflate compression support
* Ability to configure distinct TCP and UDP ports with distinct settings
* Automatic log streams with (hostname, app_name, app_id) fields
4. LogsQL improvements
● Filtering shorthands
● week_range and day_range filters
● Limiters
● Log analytics
● Data extraction and transformation
● Additional filtering
● Sorting
5. VictoriaLogs Roadmap
● Accept logs via OpenTelemetry protocol
● VMUI improvements based on HTTP querying API
● Improve Grafana plugin for VictoriaLogs: https://github.com/VictoriaMetrics/victorialogs-datasource
● Cluster version
○ Try single-node VictoriaLogs - it can replace 30-node Elasticsearch cluster in production
● Transparent historical data migration to object storage
○ Try single-node VictoriaLogs with persistent volumes - it compresses 1TB of production logs from Kubernetes to 20GB
● See https://docs.victoriametrics.com/victorialogs/roadmap/
Try it out: https://victoriametrics.com/products/victorialogs/
How GenAI Can Improve Supplier Performance Management.pdf - Zycus
Data Collection and Analysis with GenAI enables organizations to gather, analyze, and visualize vast amounts of supplier data, identifying key performance indicators and trends. Predictive analytics forecast future supplier performance, mitigating risks and seizing opportunities. Supplier segmentation allows for tailored management strategies, optimizing resource allocation. Automated scorecards and reporting provide real-time insights, enhancing transparency and tracking progress. Collaboration is fostered through GenAI-powered platforms, driving continuous improvement. NLP analyzes unstructured feedback, uncovering deeper insights into supplier relationships. Simulation and scenario planning tools anticipate supply chain disruptions, supporting informed decision-making. Integration with existing systems enhances data accuracy and consistency. McKinsey estimates GenAI could deliver $2.6 trillion to $4.4 trillion in economic benefits annually across industries, revolutionizing procurement processes and delivering significant ROI.
Flutter vs. React Native: A Detailed Comparison for App Development in 2024 - dhavalvaghelanectarb
Choosing the right framework for your cross-platform mobile app can be a tough decision. Both Flutter and React Native offer compelling features and have earned their place in the development world. Here is a detailed comparison of the pros and cons of developing mobile apps in React Native vs. Flutter, to help you weigh their strengths and weaknesses.
Boost Your Savings with These Money Management Apps - Jhone kinadey
A money management app can transform your financial life by tracking expenses, creating budgets, and setting financial goals. These apps offer features like real-time expense tracking, bill reminders, and personalized insights to help you save and manage money effectively. With a user-friendly interface, they simplify financial planning, making it easier to stay on top of your finances and achieve long-term financial stability.
Orca: Nocode Graphical Editor for Container Orchestration - Pedro J. Molina
Tool demo on CEDI/SISTEDES/JISBD2024 at A Coruña, Spain. 2024.06.18
"Orca: Nocode Graphical Editor for Container Orchestration"
by Pedro J. Molina PhD. from Metadev
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie... - Paul Brebner
Closing talk for the Performance Engineering track at Community Over Code EU (Bratislava, Slovakia, June 5 2024): https://eu.communityovercode.org/sessions/2024/why-apache-kafka-clusters-are-like-galaxies-and-other-cosmic-kafka-quandaries-explored/
Instaclustr (now part of NetApp) manages 100s of Apache Kafka clusters of many different sizes, for a variety of use cases and customers. For the last 7 years I've been focused outwardly on exploring Kafka application development challenges, but recently I decided to look inward and see what I could discover about the performance, scalability and resource characteristics of the Kafka clusters themselves. Using a suite of Performance Engineering techniques, I will reveal some surprising discoveries about cosmic Kafka mysteries in our data centres, related to: cluster sizes and distribution (using Zipf's Law), horizontal vs. vertical scalability, and predicting Kafka performance using metrics, modelling and regression techniques. These insights are relevant to Kafka developers and operators.
A neural network is a machine learning program, or model, that makes decisions in a manner similar to the human brain, by using processes that mimic the way biological neurons work together to identify phenomena, weigh options and arrive at conclusions.
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt - Peter Caitens
Join us for an engaging session led by Flow Champion, Tom Kitt. This session will dive into a technique of enhancing the user interfaces and user experiences within Screen Flows using the Salesforce Lightning Design System (SLDS). This technique uses Native functionality, with No Apex Code, No Custom Components and No Managed Packages required.
Hyperledger Besu Quick Start (Private Networks) - wonyong hwang
This is a hands-on session on Hyperledger Besu's private networks. The main content is excerpted from the official documentation at https://besu.hyperledger.org/private-networks/tutorials and covers Privacy Enabled Networks and Permissioned Networks.
Secure-by-Design Using Hardware and Software Protection for FDA ComplianceICS
This webinar explores the “secure-by-design” approach to medical device software development. During this important session, we will outline which security measures should be considered for compliance, identify technical solutions available on various hardware platforms, summarize hardware protection methods you should consider when building in security and review security software such as Trusted Execution Environments for secure storage of keys and data, and Intrusion Detection Protection Systems to monitor for threats.
The Rising Future of CPaaS in the Middle East 2024Yara Milbes
Explore "The Rising Future of CPaaS in the Middle East in 2024" with this comprehensive PPT presentation. Discover how Communication Platforms as a Service (CPaaS) is transforming communication across various sectors in the Middle East.
4. Data Visualization
Data visualization is the presentation of data with graphics. It is a way to summarize your findings and display them in a form that facilitates interpretation and can help in identifying patterns or trends.
6. Importance of Data Visualization
A picture is worth a thousand words. Humans are easily attracted to visuals, and most of us prefer to understand a particular scenario through pictures instead of text. It is faster to recognize a result than to read a paragraph. When done well, visualizations explain complex ideas simply; oftentimes charts tell the story much faster than a prose description or a table.
7. Importance of Data Visualization
Below is the sample data (Sales Products):

Retailer country | Order method type | Retailer type | Product line | Product type | Product | Year | Revenue | Quantity | Gross margin
United States | Telephone | Golf Shop | Personal Accessories | Navigation | Trail Master | 2012 | 1095 | 3 | 0.34767123
United States | Telephone | Department Store | Camping Equipment | Sleeping Bags | Hibernator | 2012 | 160103.2 | 1160 | 0.3769019
United States | Telephone | Department Store | Camping Equipment | Sleeping Bags | Hibernator Self-Inflating Mat | 2012 | 66514.28 | 556 | 0.5440107
United States | Telephone | Department Store | Camping Equipment | Sleeping Bags | Hibernator Pad | 2012 | 16205.73 | 411 | 0.51382196
United States | Telephone | Department Store | Camping Equipment | Sleeping Bags | Hibernator Pillow | 2012 | 33520.42 | 2475 | 0.46298346
United States | Fax | Outdoors Shop | Camping Equipment | Cooking Gear | TrailChef Water Bag | 2013 | 19418.52 | 3102 | 0.53194888
United States | Fax | Outdoors Shop | Camping Equipment | Cooking Gear | TrailChef Cook Set | 2013 | 42304.32 | 794 | 0.34365616
United States | Fax | Outdoors Shop | Camping Equipment | Cooking Gear | TrailChef Single Flame | 2013 | 52266.32 | 824 | 0.26880025
United States | Telephone | Department Store | Camping Equipment | Packs | Canyon Mule Journey Backpack | 2012 | 235660.36 | 676 | 0.38805542
United States | Telephone | Department Store | Camping Equipment | Packs | Canyon Mule Cooler | 2012 | 53822.16 | 1652 | 0.50890117
United States | Web | Golf Shop | Personal Accessories | Watches | Venue | 2014 | 73949 | 1013 | 0.42858294
United States | Web | Golf Shop | Personal Accessories | Watches | Infinity | 2014 | 157890.2 | 665 | 0.45986527
United States | Web | Golf Shop | Personal Accessories | Watches | Lux | 2014 | 67265.2 | 396 | 0.48654044

After looking at the raw data, is it possible to answer the following questions?
• Which order method type has the highest revenue?
• Which product line has the highest quantity?
8. Importance of Data Visualization
After displaying the data in the form of graphs, let's try answering the questions.

[Charts: Quantity by Product line (Camping Equipment, Golf Equipment, Mountaineering Equipment, Outdoor Protection, Personal Accessories) and Revenue by order method type (E-mail, Fax, Sales visit, Telephone, Web)]
9. Importance of Data Visualization
After displaying the data in the form of graphs, let's try answering the questions.

[Pie chart: Revenue by order method type (E-mail, Fax, Sales visit, Telephone, Web)]

Which order method type has the highest revenue?
The Web order method type has the highest revenue among all the order method types. Moreover, looking at the pie chart you can also tell which order method type has the lowest revenue: E-mail.
10. Importance of Data Visualization
After displaying the data in the form of graphs, let's try answering the questions.

Which product line has the highest quantity?

[Bar chart: Quantity by Product line (Camping Equipment, Golf Equipment, Mountaineering Equipment, Outdoor Protection, Personal Accessories)]

From the graph it can easily be seen that Personal Accessories has the highest quantity among all the product lines.
26. Grammar of Graphics
Every form of communication needs to have a grammar. Since visualization is also a form of communication, it too needs a foundation of grammar.

"I am John" vs. "Am John I"
27. Components of Grammar of Graphics
Element | Description
Data | The data set for which we want to plot a graph
Aesthetics | The metrics onto which we map our data
Geometry | The visual elements used to plot the data
Facet | The groups by which we divide the data
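
To make these components concrete, here is a minimal sketch with each grammar component labeled in a comment. It uses ggplot2 (introduced on the next slide) and its bundled mpg data set rather than the deck's own data, so treat it as an illustration only:

library(ggplot2)

ggplot(data = mpg,                 # Data: the data set to plot
       aes(x = displ, y = hwy)) +  # Aesthetics: variables mapped to the x and y axes
  geom_point() +                   # Geometry: draw the mapping as points
  facet_grid(. ~ drv)              # Facet: split the plot into one panel per drive type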
29. Visualization with ggplot2
ggplot2 is a system for declaratively creating graphics, based on the Grammar of Graphics. You provide the data, tell ggplot2 how to map variables to aesthetics and what graphical primitives to use, and it takes care of the details.
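
As a minimal sketch of that division of labor (using the mtcars data set that ships with R, since the deck's customer_churn data is loaded elsewhere):

library(ggplot2)

# Data: mtcars; Aesthetics: cylinder count on the x axis;
# Geometry: one bar per distinct x value. ggplot2 handles
# the counting, scales, and axes automatically.
ggplot(data = mtcars, aes(x = factor(cyl))) +
  geom_bar()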
44. geom_point()
The geom_point() function helps in making scatter plots with ggplot2. A scatter plot helps in understanding how one variable changes with respect to another. It is used for two continuous variables.
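
A minimal sketch, assuming ggplot2 is loaded and using its bundled mpg data set (the deck's own data is not shown here):

library(ggplot2)

# Two continuous variables: engine displacement vs. highway mileage
ggplot(data = mpg, aes(x = displ, y = hwy)) +
  geom_point()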
50. geom_boxplot()
The geom_boxplot() function helps in making box plots with ggplot2. A box plot shows five summary statistics: the minimum, the 25th percentile, the median, the 75th percentile, and the maximum. It is thus useful for visualizing the spread of the data and deriving inferences accordingly.
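
A minimal sketch, again using ggplot2's bundled mpg data set as a stand-in for the deck's data:

library(ggplot2)

# One box per vehicle class; each box summarizes the spread of
# highway mileage (points beyond the whiskers are drawn as outliers)
ggplot(data = mpg, aes(x = class, y = hwy)) +
  geom_boxplot()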
54. Faceting the data
facet_grid() is used to facet the data. Faceting is used when a plot is too chaotic and some variables have to be grouped into different facets to give a better visualization.
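
A minimal sketch (using ggplot2's bundled mpg data set as an illustrative assumption):

library(ggplot2)

# Instead of one crowded scatter plot, draw one panel per cylinder
# count; in the rows ~ columns formula, "." means "do not split"
ggplot(data = mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_grid(. ~ cyl)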
59. Theme Layer
Initial graph:

ggplot(data = customer_churn, aes(x = tenure)) +
  geom_histogram(fill = "tomato3", col = "mediumaquamarine") -> g1

After adding a title:

g1 + labs(title = "Distribution of tenure") -> g2
60. Theme Layer
After adding a title:

g1 + labs(title = "Distribution of tenure") -> g2

After adding a panel background:

g2 + theme(panel.background = element_rect(fill = "olivedrab3")) -> g3
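
Putting the three steps together as one self-contained sketch (the deck's customer_churn data set is not included here, so a hypothetical stand-in data frame is generated to make the snippet runnable):

library(ggplot2)

# Hypothetical stand-in for the deck's customer_churn data
customer_churn <- data.frame(tenure = rpois(500, lambda = 30))

ggplot(data = customer_churn, aes(x = tenure)) +
  geom_histogram(fill = "tomato3", col = "mediumaquamarine") -> g1  # initial histogram

g1 + labs(title = "Distribution of tenure") -> g2                  # title layer

g2 + theme(panel.background = element_rect(fill = "olivedrab3")) -> g3  # panel styling

g3  # print the final plot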
64. Quiz
Which of these is the correct code to make a bar plot for the ‘TechSupport’ column? The color of the bars should be ‘blue’ and the title of the plot should be ‘Distribution of Tech Support’.
1. plot(customer_churn$TechSupport, col="blue", main="Distribution of Tech Support")
2. plot(customer_churn$TechSupport, fill="blue", main="Distribution of Tech Support")
3. plot(customer_churn$TechSupport, col="blue", title="Distribution of Tech Support")
4. plot(customer_churn$TechSupport, color="blue", title="Distribution of Tech Support")
65. Quiz
Which of these is the correct code to make a histogram for the ‘tenure’ column? The fill color of the bins should be ‘azure’ and the number of bins should be 87.
1. ggplot(data = customer_churn, aes(x = tenure, col = 'azure')) + geom_histogram(bins = 87)
2. ggplot(data = customer_churn, aes(x = tenure)) + geom_histogram(col = "azure", bins = 87)
3. ggplot(data = customer_churn, aes(x = tenure)) + geom_histogram(fill = "azure", bins = 87)
4. ggplot(data = customer_churn, aes(x = tenure, fill = 'azure')) + geom_histogram(bins = 87)
66. Quiz
Which of these is the correct code to make a bar plot for the ‘OnlineBackup’ column? The color of the bars should be determined by the ‘PhoneService’ column.
1. ggplot(data = customer_churn, aes(fill = OnlineBackup, x = PhoneService)) + geom_bar()
2. ggplot(data = customer_churn, aes(y = OnlineBackup, fill = PhoneService)) + geom_bar()
3. ggplot(data = customer_churn, aes(x = OnlineBackup)) + geom_bar(fill = PhoneService)
4. ggplot(data = customer_churn, aes(x = OnlineBackup, fill = PhoneService)) + geom_bar()
67. Quiz
To which of these geometries can you add facet_grid()?
1. geom_bar()
2. geom_histogram()
3. geom_point()
4. All of the above