R Enthusiasts Meetup (Spotkanie Entuzjastów R), Politechnika Warszawska, 17.03.2016.
Examples: data visualization, user clustering, anomaly detection, and forecasting with Google Analytics and R.
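To make the forecasting and anomaly-detection examples concrete, here is a minimal R sketch, assuming a data frame ga.data with one row per day and a numeric sessions column (the shape produced by the RGoogleAnalytics snippets at the end of this page); the Holt-Winters model and the 3-sigma rule are illustrative choices, not necessarily the ones used in the talk.
# Daily sessions as a time series with weekly seasonality
sessions <- ts(ga.data$sessions, frequency = 7)
fit <- HoltWinters(sessions)                 # exponential smoothing fit
fcst <- predict(fit, n.ahead = 14,           # 14-day forecast with
                prediction.interval = TRUE)  # upper/lower bounds
# Flag days whose observed value sits far from the smoothed fit
# (fitted values start after the first full period)
res <- sessions[-seq_len(frequency(sessions))] - fit$fitted[, "xhat"]
anomalies <- which(abs(res) > 3 * sd(res))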
GRONINGEN PHP - MySQL 8.0, not only good, great - Gabriela Ferrara
Sick and tired of “X technology is only good for starting out; once you outgrow it, move to Y”? Good news: you don't need to move away, you just need to dig in further! In this talk, you'll learn about improvements in the newest version of the most used database in the world. What are window functions? How do you use CTEs? How can the new default encoding help me, and what should I look out for when upgrading versions? Is MySQL just an OLTP database, or is there more to it?
Javascript is often referred to as the assembly language of the web. It is used to add interactivity and dynamic behavior to web pages. Some key uses of Javascript include making asynchronous requests to servers for dynamic content updates without reloading the entire page, manipulating elements on a page through actions like hiding/showing elements, and communicating with web services to retrieve and display content in formats like JSON and XML. Frameworks built on Javascript like jQuery, SproutCore, Google Web Toolkit, and Cappuccino further enhance its capabilities for building interactive web applications.
The document describes a case study of segmenting website users into interest groups based on their online behavior, without requiring registration. It involves using Google Analytics and Google Tag Manager to collect data on users' page views at different content levels (beginner, intermediate, advanced). K-means clustering is then performed in R to categorize users into 3 interest groups based on their viewing patterns. The results are visualized and can be used for targeted marketing campaigns.
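A minimal sketch of the clustering step described above, in base R; the matrix of per-user view counts and its column names are hypothetical stand-ins for the data collected via Google Tag Manager.
# Hypothetical per-user view counts: one row per user, one column
# per content level (beginner / intermediate / advanced)
set.seed(42)
views <- matrix(rpois(300, lambda = 3), ncol = 3,
                dimnames = list(NULL, c("beginner", "intermediate", "advanced")))
km <- kmeans(views, centers = 3)   # 3 interest groups, as in the case study
table(km$cluster)                  # size of each interest group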
The document discusses how websites track users behind the scenes through cookies, device fingerprinting, and tracking code injected by third parties. It notes that 12% of scanned websites contained injected tracking code, sometimes from potential malware. The document argues that data collection is acceptable only for aggregated analytics, more accurate ads, and better content, while injecting tracking code, stealing audiences, or spying is not. It encourages scanning one's own website for tracking code.
Powering Heap With PostgreSQL And CitusDB (PGConf Silicon Valley 2015) - Dan Robinson
At Heap, we lean on PostgreSQL for all our backend heavy lifting. We support an expressive set of queries — conversion funnels with filtering and grouping, retention analysis, and behavioral cohorting to name a few — across billions of users and tens of billions of events. Results need to come back in a matter of seconds and reflect up-to-the-minute data.
This talk will discuss these challenges, with a particular focus on:
- Using CitusDB for interactive analysis across 50 terabytes of data and counting.
- PostgreSQL and Kafka: two great tastes that taste great together.
- UDFs in C and PL/pgSQL, partial indexes for pre-aggregation, and other tricks up our sleeves.
Digital analytics with R - Sydney Users of R Forum - May 2015 - Johann de Boer
This document discusses using the ganalytics R package to access and analyze Google Analytics data through R. It provides an overview of Google Analytics and its APIs, demonstrates how to build queries with ganalytics, extract and summarize data in R. It also discusses enhancing ganalytics by improving documentation, testing, adding features, and internationalization. The document encourages participation in open source development of the package.
The document discusses best practices for mobile analytics and A/B testing. It recommends event-based analytics over page view counting and outlines steps for designing analytics and experiments: 1) define business and UX goals, 2) determine questions, 3) map events to answer questions, and 4) build user paths and funnels. Code snippets show implementing analytics tracking and automated testing to prevent regressions. Colorized logging and Bonjour logging are suggested for fast feedback.
OSA Con 2022 - Building Event Collection SDKs and Data Models - Paul Boocock ... - Altinity Ltd
OSA Con 2022: Building Event Collection SDKs and Data Models
Paul Boocock - Snowplow
In this talk we'll go through how we have designed and built over 20 different SDKs to collect events from all sorts of applications (from web & mobile to IoT to server-side), allowing users to collect a rich event stream of data. Then we'll dive into, and demonstrate, the cross-warehouse downstream data models which aggregate the event stream into easy-to-consume data products for analytics, AI, composable CDP, recommendation engines, and many other use cases.
Google Analytics is an analytics tool of which only part of the potential is commonly known. Besides measuring audiences and their behavior, Google Analytics lets you prioritize online marketing investments, capture behavior in Single Page Applications, and visualize offline data, for example from a CRM, combining it with online visit data. It can also collect real-time sales data, for example from e-commerce and from physical devices such as Bluetooth beacons. Its functionality, in combination with BigQuery and other Google Cloud Platform services, makes Google Analytics one of the most interesting analytics platforms for digital transformation.
To watch the video in which this presentation was used, click here: https://www.youtube.com/watch?v=2mfIU-NXGXI
To see the announcement on our website, click here: https://www.paradigmadigital.com/eventos/usar-google-analytics/
For the announcement via the Meetup.com group, click here: https://www.meetup.com/es-ES/Front-end-Developers-Madrid/events/231793469/
D3.js - A picture is worth a thousand words - Apptension
This document provides an overview of D3.js, a JavaScript library for data visualization. It discusses why data visualization is useful, some key concepts in D3 like selections, entering and updating data, and creating reusable components. It also covers transitions, scales, axes, SVG, and common layouts. The document encourages exploring more examples on the bl.ocks website and concludes by thanking the audience.
Running Intelligent Applications inside a Database: Deep Learning with Python... - Miguel González-Fierro
In this talk we present a new paradigm of computation where the intelligence is computed inside the database. Standard software systems must fetch data from the database to execute a routine; if the data is large, this movement is inefficient. Stored procedures tried to solve this issue in the past by allowing simple functions to be computed inside the database, but only simple routines can be executed.
To showcase the capabilities of our new system, we created a lung cancer detection algorithm using Microsoft's Cognitive Toolkit, also known as CNTK. We used transfer learning between the ImageNet dataset, which contains natural images, and a lung cancer dataset, which contains scans of horizontal sections of the lungs of healthy and sick patients. Specifically, a Convolutional Neural Network pretrained on ImageNet is applied to the lung cancer dataset to generate features. Once the features are computed, a boosted tree is used to predict whether the patient has cancer.
The whole process is computed inside the database, so data movement is minimized. We are even able to execute the algorithm using the GPU of the virtual machine that hosts the database. Using a GPU, we can compute the featurization in less than 1 h, in contrast to a CPU, which would take up to 32 h. Finally, we set up an API to connect the solution to a web app, where a doctor can analyze the images and get a prediction for a patient.
The document discusses using jQuery and custom data attributes to add client-side behavior and interactivity to Oracle APEX applications. It introduces:
- The data attribute for unambiguously identifying elements
- jQuery for element selection, event handling, and AJAX
- Changing page items to HTML5 input types using data attributes
- A rowclick plugin for adding click handling to report rows
- Record sorting in reports using jQuery sortable
- Deleting records from reports using click events and PL/SQL processing
The document provides code examples and discusses building interactive features like record sorting and deletion without custom coding.
This document provides an overview and summary of new features for AdWords scripts, including:
1) Bulk upload, which allows making bulk changes by uploading data in CSV format from various sources.
2) Managing display criteria like keywords, placements, topics and audiences for both inclusion and exclusion.
3) Working with existing shopping campaigns, product groups, and running shopping reports.
4) Integrating with Google services for external data and advanced APIs like Analytics, BigQuery, Calendar and Tasks.
This document provides an overview of Google Analytics for developers. It describes Google Analytics as a platform for reporting on visitor behavior across websites, mobile apps, connected devices, and offline activities. It outlines the process for setting up Google Analytics tracking on a website or mobile app, including signing up for an account, creating properties, and inserting the tracking code. It also discusses metrics, dimensions, event tracking, and using Google Tag Manager.
The document discusses AdWords scripts for managing multiple AdWords accounts from a central MCC account using JavaScript. It provides an overview of MCC scripts, how to get started with them, and examples of common tasks like accessing child accounts, selecting a specific account, processing accounts in parallel, and returning results.
Javascript unit testing with QUnit and Sinon - Lars Thorup
This document discusses JavaScript unit testing with QUnit and Sinon. It introduces Lars Thorup and his background in software development, testing, and coaching. It then provides an overview of unit testing, explaining why it is beneficial and how to implement it. Finally, it demonstrates various QUnit and Sinon techniques for writing tests, including assertions, spies, stubs, mocks, asynchronous code, the DOM, and advanced mocking.
SQL PASS 2017 - Building one million predictions per second using SQL Server ... - Amit Banerjee
Using the power of OLTP and data transformation in SQL Server 2016 and advanced analytics in Microsoft R Server, various industries are pushing the boundary of transactions processed per second (tps) for different use cases. In this talk, we walk through the use case of predicting the loan charge-off (loan default) rate, the architecture configuration that enables this use case, and the rich visual dashboard that lets customers do what-if analysis. Attend this session to find out how SQL + R lets you build an “intelligent data warehouse”.
This talk builds on my previous talk about how SQL Server 2016 helps build an intelligent data warehouse.
RStudio uses data from its products and customer interactions to improve user experience and product quality. Data is collected from tools in RStudio's data science toolchain including exploration, analysis, modeling, visualization and communication packages. This data is cleaned, transformed and visualized to understand user needs. Insights inform improvements to RStudio products, documentation and support processes. The goal is to enhance reproducibility, scalability and measurability of improvements through a data-driven approach.
Simplify Feature Engineering in Your Data Warehouse - FeatureByte
Feature engineering is critical to the successful delivery of AI solutions. Crafting relevant features from organizational data requires business domain knowledge and creativity, powered by the human capital in data science teams.
With the growing adoption of machine learning and AI in organizations, there is a pressing need to develop processes around ML development and deployment that maximize productivity with limited resources. While there is no lack of tools for ML model management, solutions for feature engineering remain inadequate.
In this presentation, we outline our approach and design to make feature engineering efficient, repeatable and enjoyable for data science practitioners so they can experiment and iterate fast, without overlooking important issues such as scalability, deployment and auditability.
Most metrics systems link timeseries to a string key; some add a few tags. These keys often lack information, use inconsistent formats and terminology, and are poorly organized. As the number of people and programs generating, processing, storing, and visualizing metrics grows, this approach becomes very cumbersome, and there is a lot to be gained from taking a step back and rethinking metric identifiers and metadata.
Metrics 2.0 is a set of conventions around metrics: with barely any extra work, metrics become self-describing and standardized. Compatibility between tools increases dramatically, dashboards can automatically convert information needs into graphs, graph renderers can present data more usefully, and anomaly detectors and aggregators can work more autonomously and avoid common mistakes. The result: less micromanaging of software and configuration, quicker results, more clarity, less frustration, and less room for error.
This talk will also cover the tools that turn this concept into production-ready reality:
Graph-Explorer is an application that integrates with Graphite. Enter an expression that represents an information need and it generates the corresponding graphs or alerting rules, automatically applying unit conversion, aggregation, processing, etc.
Statsdaemon is an aggregation daemon, like Etsy's Statsd, that expresses the aggregations and statistical operations it performs by updating the metrics' tags, making sure that the metric metadata always corresponds to the data.
Dieter Plaetinck is a systems-gone-backend engineer at Vimeo.
The document discusses using the Google Custom Search API to build a web application for searching definitions from the SML Basis Standard Library. It provides steps to generate a code snippet from Google Custom Search and paste it into a page to display search results. It also describes using the Google JavaScript API to make search requests from the client side and process the results, displaying them on the page while avoiding cross-domain errors.
Connecting Your Customers – Building Successful Mobile Games through the Powe... - Amazon Web Services
Free to play is now the standard for mobile and social games. But succeeding in free-to-play is not easy: You need in-depth data analytics to gain insight into your players so you can monetize your game. Learn how to leverage new features of AWS services such as Elastic MapReduce, Amazon S3, Kinesis, and Redshift to build an end-to-end analytics pipeline. Plus, we’ll show you how to easily integrate analytics with other AWS services in your game.
The document describes the Cross-Industry Standard Process for Data Mining (CRISP-DM), which is a six step process for data analysis projects. The six steps are: 1) business understanding, 2) data understanding, 3) data preparation, 4) modeling, 5) evaluation, and 6) deployment. Each step of the CRISP-DM process is explained in detail with examples provided. The overall summary is that CRISP-DM provides a standardized and reproducible method for completing data analysis projects while keeping the initial business goal and question in mind.
Similar to Google Analytics + R. Praktyczne przykłady.
Google Analytics: learn more about the users of your website - Michal Brys
Where do the users of my website come from? What actions do they take on the site? What do they have problems with? Where can we find the most valuable audiences for marketing campaigns? Michał answers these and many other questions, showing how to find the answers using Google Analytics. He also presents a set of tools useful for developers (including the API) and a case study of its use in a Big Data context. Finally, he points out the directions in which web analytics is developing and how to prepare for these changes.
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Epistemic Interaction - tuning interfaces to provide information for AI support - Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... - Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... - DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
A tale of scale & speed: How the US Navy is enabling software delivery from l... - sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 - Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Securing your Kubernetes cluster: a step-by-step guide to success! - KatiaHIMEUR1
Today, after several years of existence, with an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been easier to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Communications Mining Series - Zero to Hero - Session 1 - DianaGray10
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Climate Impact of Software Testing at Nordic Testing Days - Kari Kakkonen
My slides at Nordic Testing Days 6.6.2024
The climate impact and sustainability of software testing are discussed in the talk. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be extended with sustainability, which can then be measured continuously. Test environments can be used less, at smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
Removing Uninteresting Bytes in Software Fuzzing - Aftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor... - SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Enhancing adoption of Open Source Libraries. A case study on Albumentations.AI - Vladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
# Get the sessions per day for January 2014
# (table.id "ga:000000" is a placeholder profile ID; token is the OAuth
# token produced earlier by the package's Auth() step)
library(RGoogleAnalytics)
query.list <- Init(start.date = "2014-01-01",
                   end.date = "2014-01-31",
                   dimensions = "ga:date",
                   metrics = "ga:sessions",
                   table.id = "ga:000000")
# Create the Query Builder object
ga.query <- QueryBuilder(query.list)
# Extract the data and store it in a data frame
ga.data <- GetReportData(ga.query, token)
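A quick follow-up one might run on the result, assuming GetReportData returns columns named date and sessions (the ga: prefix dropped):
# Inspect and plot the daily sessions just retrieved
head(ga.data)
plot(ga.data$sessions, type = "l",
     xlab = "Day of January 2014", ylab = "Sessions")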
# Get unique content-group views by custom dimension and content group
# for all of 2014 (ga:dimension01 is a custom dimension defined in the
# deck's tracking setup, e.g. a per-user identifier)
query.list <- Init(start.date = "2014-01-01",
                   end.date = "2014-12-31",
                   dimensions = "ga:dimension01,ga:contentGroup01",
                   metrics = "ga:contentGroupUniqueViews01",
                   table.id = "ga:000000")
# Create the Query Builder object
ga.query <- QueryBuilder(query.list)
# Extract the data and store it in a data frame
ga.data <- GetReportData(ga.query, token)
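To tie this query back to the k-means case study summarized earlier on this page, here is a sketch of one way to pivot the result and cluster users; it assumes the returned columns are named dimension01, contentGroup01, and contentGroupUniqueViews01, that the metric column comes back numeric, and that the custom dimension holds a per-user identifier.
# One row per user, one column per content group
views <- xtabs(contentGroupUniqueViews01 ~ dimension01 + contentGroup01,
               data = ga.data)
# Cluster users into 3 interest groups based on their viewing patterns
km <- kmeans(as.matrix(views), centers = 3)
table(km$cluster)   # users per interest group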