Title: Neo4j: The World's Leading Graph DB
Speaker: George Eleftheriadis (https://gr.linkedin.com/in/george-eleftheriadis-4526ba51/)
Date: Monday, April 18, 2016
Event: https://meetup.com/Athens-Big-Data/events/229812890/
Scalding - Hadoop Word Count in LESS than 70 lines of codeKonrad Malawski
Twitter Scalding is built on top of Cascading, which is built on top of Hadoop. It's basically a very nice to read and extend DSL for writing map reduce jobs.
This is an quick introduction to Scalding and Monoids. Scalding is a Scala library that makes writing MapReduce jobs very easy. Monoids on the other hand promise parallelism and quality and they make some more challenging algorithms look very easy.
The talk was held at the Helsinki Data Science meetup on January 9th 2014.
In the past year there has been a tremendous amount of activity on Scala APIs for Hadoop. In this talk we`ll talk about writing Map/Reduce jobs in a more functional manner and explore the three most popular Scala packages for Hadoop: Scalding, Scoobi and Scrunch. Detailed usage examples will be provided for each along with some real world use cases.
Scalding: Twitter's Scala DSL for Hadoop/Cascadingjohnynek
Talk given at the 2012 Hadoop Summit in San Jose, CA.
Scalding is a Scala DSL for Cascading which brings natural functional programming to Hadoop. It is open-source, developed by Twitter and others.
Follow: twitter.com/scalding
github.com/twitter/scalding
Scalding - Hadoop Word Count in LESS than 70 lines of codeKonrad Malawski
Twitter Scalding is built on top of Cascading, which is built on top of Hadoop. It's basically a very nice to read and extend DSL for writing map reduce jobs.
This is an quick introduction to Scalding and Monoids. Scalding is a Scala library that makes writing MapReduce jobs very easy. Monoids on the other hand promise parallelism and quality and they make some more challenging algorithms look very easy.
The talk was held at the Helsinki Data Science meetup on January 9th 2014.
In the past year there has been a tremendous amount of activity on Scala APIs for Hadoop. In this talk we`ll talk about writing Map/Reduce jobs in a more functional manner and explore the three most popular Scala packages for Hadoop: Scalding, Scoobi and Scrunch. Detailed usage examples will be provided for each along with some real world use cases.
Scalding: Twitter's Scala DSL for Hadoop/Cascadingjohnynek
Talk given at the 2012 Hadoop Summit in San Jose, CA.
Scalding is a Scala DSL for Cascading which brings natural functional programming to Hadoop. It is open-source, developed by Twitter and others.
Follow: twitter.com/scalding
github.com/twitter/scalding
Query optimizers and people have one thing in common: the better they understand their data, the better they can do their jobs. Optimizing queries is hard if you don't have good estimates for the sizes of the intermediate join and aggregate results. Data profiling is a technique that scans data, looking for patterns within the data such as keys, functional dependencies, and correlated columns. These richer statistics can be used in Apache Calcite's query optimizer, and the projects that use it, such as Apache Hive, Phoenix and Drill. We describe how we built a data profiler as a table function in Apache Calcite, review the recent research and algorithms that made it possible, and show how you can use the profiler to improve the quality of your data.
A talk given by Julian Hyde at DataWorks Summit, San Jose, on June 14th 2017.
GeoMesa on Apache Spark SQL with Anthony FoxDatabricks
GeoMesa is an open-source toolkit for processing and analyzing spatio-temporal data, such as IoT and sensor-produced observations, at scale. It provides a consistent API for querying and analyzing data on top of distributed databases (e.g. HBase, Accumulo, Bigtable, Cassandra) and messaging networks (e.g. Kafka) to handle batch analysis of historical archives of data and low-latency processing of data in-stream.
GeoMesa has deep integration with Spark SQL. It has added spatial types (e.g. Point, LineString, Polygons), spatial predicates (st_contains, st_intersects, etc.), and geometry processing functions (e.g. st_buffer, st_convexHull, etc.) to Spark SQL. It also optimizes the processing of these extensions by integrating with the Catalyst SQL optimizer to intercept SQL statements with spatial predicates and provision RDDs based on the underlying spatial index.
This session will describe the implementation of the GeoMesa Spark SQL integration, illustrate its application in production systems and demonstrate spatial aggregations and analytics using map-based visualizations.
Overview of JD Long's experimental "segue" package which marshals and manages Hadoop clusters (for non-Big Data problems) with Amazon's Elastic MapReduce service.
Presented at the February 2011 meeting of the Greater Boston useR Group.
Query optimizers and people have one thing in common: the better they understand their data, the better they can do their jobs. Optimizing queries is hard if you don't have good estimates for the sizes of the intermediate join and aggregate results. Data profiling is a technique that scans data, looking for patterns within the data such as keys, functional dependencies, and correlated columns. These richer statistics can be used in Apache Calcite's query optimizer, and the projects that use it, such as Apache Hive, Phoenix and Drill. We describe how we built a data profiler as a table function in Apache Calcite, review the recent research and algorithms that made it possible, and show how you can use the profiler to improve the quality of your data.
A talk given by Julian Hyde at Apache: Big Data, Miami, on May 16th 2017.
Apache Spark - Key-Value RDD | Big Data Hadoop Spark Tutorial | CloudxLabCloudxLab
Big Data with Hadoop & Spark Training: http://bit.ly/2sewz2m
This CloudxLab Key-Value RDD tutorial helps you to understand Key-Value RDD in detail. Below are the topics covered in this tutorial:
1) Spark Key-Value RDD
2) Creating Key-Value Pair RDDs
3) Transformations on Pair RDDs - reduceByKey(func)
4) Count Word Frequency in a File using Spark
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...StampedeCon
Learn how to model beyond traditional direct access in Apache Cassandra. Utilizing the DataStax platform to harness the power of Spark and Solr to perform search, analytics, and complex operations in place on your Cassandra data!
Data modeling is hard, especially in the world of distributed NoSQL stores. With relational databases, developers have tended to store normalized data and shape their query model around that structure. This can come back to bite you when it comes time to scale, as complex queries across dozens of tables begin to affect application performance. It’s common to find developers rethinking their data model as query latency increases under load.
With NoSQL stores, developers must consider their query patterns from the outset of application development, designing their data model to fit those patterns. A number of techniques, new and old, can be used to allow for maximum performance and scalability.
Topics covered will include: De-normalization, time boxing, conflict resolution, and convergent & commutative replicated data types. Additionally, discussions of common query patterns in light of the capabilities of various NoSQL data stores will be reviewed.
Goal Based Data Production with Sim SimeonovDatabricks
Since the invention of SQL and relational databases, data production has been about specifying how data should be transformed through queries. While Apache Spark can certainly be used as a general distributed SQL-like query engine, the power and granularity of Spark’s APIs allows for a fundamentally different, and far more productive, approach. This session will introduce the principles of goal-based data production with examples ranging from ETL, to exploratory data analysis, to feature engineering for machine learning.
Goal-based data production concerns itself with specifying WHAT the desired result is, leaving the details of HOW the result is achieved to the smart data warehouse running on top of Spark. That not only substantially increases productivity, but also significantly expands the audience that can work directly with Spark: from developers and data scientists to technical business users. With specific data and architecture patterns and live demos, this session will demonstrate how easy it is for any company to create its own smart data warehouse with Spark 2.x and gain the benefits of goal-based data production.
My name is Neta Barkay , and I'm a data scientist at LivePerson.
I'd like to share with you a talk I presented at the Underscore Scala community on "Efficient MapReduce using Scalding".
In this talk I reviewed why Scalding fits big data analysis, how it enables writing quick and intuitive code with the full functionality vanilla MapReduce has, without compromising on efficient execution on the Hadoop cluster. In addition, I presented some examples of Scalding jobs which can be used to get you started, and talked about how you can use Scalding's ecosystem, which includes Cascading and the monoids from Algebird library.
Read more & Video: https://connect.liveperson.com/community/developers/blog/2014/02/25/scalding-reaching-efficient-mapreduce
Apache Spark - Key Value RDD - Transformations | Big Data Hadoop Spark Tutori...CloudxLab
Big Data with Hadoop & Spark Training: http://bit.ly/2sm5Ekd
This CloudxLab Key-Value RDD Transformations tutorial helps you to understand Key-Value RDD transformations in detail. Below are the topics covered in this tutorial:
1) Transformations on Key-Value Pair RDD - keys(), values(), groupByKey(), combineByKey(), sortByKey(), subtractByKey(), join(), leftOuterJoin(), rightOuterJoin(), cogroup(), countByKey() and lookup()
Move your data (Hans Rosling style) with googleVis + 1 line of R codeJeffrey Breen
R's googleVis package makes it easy to use the Google Visualization API with your data. Here we demonstrate how to create a Hans Rosling-style motion chart with some sample data. Just one line of R code automatically generates 165 lines of HTML and JavaScript for us. This "lightning talk" was presented at the July 2011 meeting of the Greater Boston useR meeting.
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...Databricks
Graph data and graph analytics are increasingly important in data science and engineering. Cypher is an open language used for querying and updating graph databases and analytics platforms, which is now available in the Apache Spark environment. Neo4j Morpheus leverages the open source graph language project to integrate data from Neo4j operational graph databases with Hive and JDBC SQL data sources, using new Cypher features like the Property Graph Catalog, named graphs, graph projection, parameterized graph view functions, and graph/table views. Input and output graphs can be loaded and stored as structured collections of DataFrames with strong graph schemas to ensure data consistency and graph query optimization. Property graphs can also be analyzed and transformed using graph algorithms such as those in the GraphFrames project. Besides describing and demonstrating these capabilities, this talk also discusses the Spark Project Improvement Proposal to bring Cypher into Spark 3.0, and outlines current work to unify Cypher with other graph query languages to form a new ISO standard Graph Query Language.
Speakers: Alastair Green, Martin Junghanns
Query optimizers and people have one thing in common: the better they understand their data, the better they can do their jobs. Optimizing queries is hard if you don't have good estimates for the sizes of the intermediate join and aggregate results. Data profiling is a technique that scans data, looking for patterns within the data such as keys, functional dependencies, and correlated columns. These richer statistics can be used in Apache Calcite's query optimizer, and the projects that use it, such as Apache Hive, Phoenix and Drill. We describe how we built a data profiler as a table function in Apache Calcite, review the recent research and algorithms that made it possible, and show how you can use the profiler to improve the quality of your data.
A talk given by Julian Hyde at DataWorks Summit, San Jose, on June 14th 2017.
GeoMesa on Apache Spark SQL with Anthony FoxDatabricks
GeoMesa is an open-source toolkit for processing and analyzing spatio-temporal data, such as IoT and sensor-produced observations, at scale. It provides a consistent API for querying and analyzing data on top of distributed databases (e.g. HBase, Accumulo, Bigtable, Cassandra) and messaging networks (e.g. Kafka) to handle batch analysis of historical archives of data and low-latency processing of data in-stream.
GeoMesa has deep integration with Spark SQL. It has added spatial types (e.g. Point, LineString, Polygons), spatial predicates (st_contains, st_intersects, etc.), and geometry processing functions (e.g. st_buffer, st_convexHull, etc.) to Spark SQL. It also optimizes the processing of these extensions by integrating with the Catalyst SQL optimizer to intercept SQL statements with spatial predicates and provision RDDs based on the underlying spatial index.
This session will describe the implementation of the GeoMesa Spark SQL integration, illustrate its application in production systems and demonstrate spatial aggregations and analytics using map-based visualizations.
Overview of JD Long's experimental "segue" package which marshals and manages Hadoop clusters (for non-Big Data problems) with Amazon's Elastic MapReduce service.
Presented at the February 2011 meeting of the Greater Boston useR Group.
Query optimizers and people have one thing in common: the better they understand their data, the better they can do their jobs. Optimizing queries is hard if you don't have good estimates for the sizes of the intermediate join and aggregate results. Data profiling is a technique that scans data, looking for patterns within the data such as keys, functional dependencies, and correlated columns. These richer statistics can be used in Apache Calcite's query optimizer, and the projects that use it, such as Apache Hive, Phoenix and Drill. We describe how we built a data profiler as a table function in Apache Calcite, review the recent research and algorithms that made it possible, and show how you can use the profiler to improve the quality of your data.
A talk given by Julian Hyde at Apache: Big Data, Miami, on May 16th 2017.
Apache Spark - Key-Value RDD | Big Data Hadoop Spark Tutorial | CloudxLabCloudxLab
Big Data with Hadoop & Spark Training: http://bit.ly/2sewz2m
This CloudxLab Key-Value RDD tutorial helps you to understand Key-Value RDD in detail. Below are the topics covered in this tutorial:
1) Spark Key-Value RDD
2) Creating Key-Value Pair RDDs
3) Transformations on Pair RDDs - reduceByKey(func)
4) Count Word Frequency in a File using Spark
Beyond the Query – Bringing Complex Access Patterns to NoSQL with DataStax - ...StampedeCon
Learn how to model beyond traditional direct access in Apache Cassandra. Utilizing the DataStax platform to harness the power of Spark and Solr to perform search, analytics, and complex operations in place on your Cassandra data!
Data modeling is hard, especially in the world of distributed NoSQL stores. With relational databases, developers have tended to store normalized data and shape their query model around that structure. This can come back to bite you when it comes time to scale, as complex queries across dozens of tables begin to affect application performance. It’s common to find developers rethinking their data model as query latency increases under load.
With NoSQL stores, developers must consider their query patterns from the outset of application development, designing their data model to fit those patterns. A number of techniques, new and old, can be used to allow for maximum performance and scalability.
Topics covered will include: De-normalization, time boxing, conflict resolution, and convergent & commutative replicated data types. Additionally, discussions of common query patterns in light of the capabilities of various NoSQL data stores will be reviewed.
Goal Based Data Production with Sim SimeonovDatabricks
Since the invention of SQL and relational databases, data production has been about specifying how data should be transformed through queries. While Apache Spark can certainly be used as a general distributed SQL-like query engine, the power and granularity of Spark’s APIs allows for a fundamentally different, and far more productive, approach. This session will introduce the principles of goal-based data production with examples ranging from ETL, to exploratory data analysis, to feature engineering for machine learning.
Goal-based data production concerns itself with specifying WHAT the desired result is, leaving the details of HOW the result is achieved to the smart data warehouse running on top of Spark. That not only substantially increases productivity, but also significantly expands the audience that can work directly with Spark: from developers and data scientists to technical business users. With specific data and architecture patterns and live demos, this session will demonstrate how easy it is for any company to create its own smart data warehouse with Spark 2.x and gain the benefits of goal-based data production.
My name is Neta Barkay , and I'm a data scientist at LivePerson.
I'd like to share with you a talk I presented at the Underscore Scala community on "Efficient MapReduce using Scalding".
In this talk I reviewed why Scalding fits big data analysis, how it enables writing quick and intuitive code with the full functionality vanilla MapReduce has, without compromising on efficient execution on the Hadoop cluster. In addition, I presented some examples of Scalding jobs which can be used to get you started, and talked about how you can use Scalding's ecosystem, which includes Cascading and the monoids from Algebird library.
Read more & Video: https://connect.liveperson.com/community/developers/blog/2014/02/25/scalding-reaching-efficient-mapreduce
Apache Spark - Key Value RDD - Transformations | Big Data Hadoop Spark Tutori...CloudxLab
Big Data with Hadoop & Spark Training: http://bit.ly/2sm5Ekd
This CloudxLab Key-Value RDD Transformations tutorial helps you to understand Key-Value RDD transformations in detail. Below are the topics covered in this tutorial:
1) Transformations on Key-Value Pair RDD - keys(), values(), groupByKey(), combineByKey(), sortByKey(), subtractByKey(), join(), leftOuterJoin(), rightOuterJoin(), cogroup(), countByKey() and lookup()
Move your data (Hans Rosling style) with googleVis + 1 line of R codeJeffrey Breen
R's googleVis package makes it easy to use the Google Visualization API with your data. Here we demonstrate how to create a Hans Rosling-style motion chart with some sample data. Just one line of R code automatically generates 165 lines of HTML and JavaScript for us. This "lightning talk" was presented at the July 2011 meeting of the Greater Boston useR meeting.
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...Databricks
Graph data and graph analytics are increasingly important in data science and engineering. Cypher is an open language used for querying and updating graph databases and analytics platforms, which is now available in the Apache Spark environment. Neo4j Morpheus leverages the open source graph language project to integrate data from Neo4j operational graph databases with Hive and JDBC SQL data sources, using new Cypher features like the Property Graph Catalog, named graphs, graph projection, parameterized graph view functions, and graph/table views. Input and output graphs can be loaded and stored as structured collections of DataFrames with strong graph schemas to ensure data consistency and graph query optimization. Property graphs can also be analyzed and transformed using graph algorithms such as those in the GraphFrames project. Besides describing and demonstrating these capabilities, this talk also discusses the Spark Project Improvement Proposal to bring Cypher into Spark 3.0, and outlines current work to unify Cypher with other graph query languages to form a new ISO standard Graph Query Language.
Speakers: Alastair Green, Martin Junghanns
Graph Database Defined. A graph database is defined as a specialized, single-purpose platform for creating and manipulating graphs. Graphs contain nodes, edges, and properties, all of which are used to represent and store data in a way that relational databases are not equipped to do.
This presentation covers several aspects of modeling data and domains with a graph database like Neo4j. The graph data model allows high fidelity modeling. Using the first class relationships of the graph model allow to use much higher forms of normalization than you would use in a relational database.
Video here: https://vimeo.com/67371996
5th in the AskTOM Office Hours series on graph database technologies. https://devgym.oracle.com/pls/apex/dg/office_hours/3084
PGQL: A Query Language for Graphs
Learn how to query graphs using PGQL, an expressive and intuitive graph query language that's a lot like SQL. With PGQL, it's easy to get going writing graph analysis queries to the database in a very short time. Albert and Oskar show what you can do with PGQL, and how to write and execute PGQL code.
DevFest Istanbul - a free guided tour of Neo4JFlorent Biville
2013-11-02 : DevFest Türkiye, Istanbul.
Slightly modified version of my previous Neo4J introduction talk about Neo4J in Soft-Shake Event, Geneva, Switzerland.
An introduction to Graph databases and in particular Neo4j, including where Neo4j lives on the CAP Scale in relation to other databases, the Graph data model and a very quick introduction to the Cypher Query Language.
In this talk I will show Visualbox, a "visualization server" based on LODSPeaKr that can make easy for non javascript experts to create simple but meaningful visualizations.
Graph Databases in the Microsoft EcosystemMarco Parenzan
With SQL Server and Cosmos Db we now have graph databases broadly available, after being studied for decades in Db theory, or being a niche approach in Open Source with Neo4J. And then there are services like Microsoft Graph and Azure Digital Twins that give us vertical implementations of graph. So let's make a walkaround of graphs in the MIcrosoft ecosystem.
Similar to 3rd Athens Big Data Meetup - 2nd Talk - Neo4j: The World's Leading Graph DB (20)
22nd Athens Big Data Meetup - 1st Talk - MLOps Workshop: The Full ML Lifecycl...Athens Big Data
Title: MLOps Workshop: The Full ML Lifecycle - How to Use ML in Production
Speakers: Spyros Cavadias (https://www.linkedin.com/in/spyros-cavadias/), Konstantinos Pittas (https://www.linkedin.com/in/konstantinos-pittas-83310270/), Thanos Gkinakos (https://www.linkedin.com/in/thanos-gkinakos-03582a128/)
Date: Saturday, December 17, 2022
Event: https://www.meetup.com/athens-big-data/events/289927468/
21st Athens Big Data Meetup - 2nd Talk - Dive into ClickHouse storage systemAthens Big Data
Title: Dive into ClickHouse storage system
Speaker: Alexander Sapin (https://github.com/alesapin/)
Date: Thursday, March 5, 2020
Event: https://meetup.com/Athens-Big-Data/events/268379195/
19th Athens Big Data Meetup - 2nd Talk - NLP: From news recommendation to wor...Athens Big Data
Title: NLP: From news recommendation to word prediction
Speaker: Mihalis Papakonstantinou (https://linkedin.com/in/mihalispapak/)
Date: Thursday, December 12, 2019
Event: https://meetup.com/Athens-Big-Data/events/266762191/
21st Athens Big Data Meetup - 1st Talk - Fast and simple data exploration wit...Athens Big Data
Title: Fast and simple data exploration with ClickHouse
Speaker: Alexander Kuzmenkov (https://github.com/akuzm/)
Date: Thursday, March 5, 2020
Event: https://meetup.com/Athens-Big-Data/events/268379195/
20th Athens Big Data Meetup - 2nd Talk - Druid: under the coversAthens Big Data
Title: Druid: under the covers
Speaker: Peter Marshall (https://linkedin.com/in/amillionbytes/)
Date: Tuesday, January 28, 2020
Event: https://meetup.com/Athens-Big-Data/events/266900242/
20th Athens Big Data Meetup - 1st Talk - Druid: the open source, performant, ...Athens Big Data
Title: Druid: the open source, performant, real-time, analytical datastore
Speaker: Peter Marshall (https://linkedin.com/in/amillionbytes/)
Date: Tuesday, January 28, 2020
Event: https://meetup.com/Athens-Big-Data/events/266900242/
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on KubernetesAthens Big Data
Title: Run Spark and Flink Jobs on Kubernetes
Speaker: Chaoran Yu (https://linkedin.com/in/chaoran-yu-97b1144a/)
Date: Thursday, November 14, 2019
Event: https://meetup.com/Athens-Big-Data/events/265957761/
18th Athens Big Data Meetup - 1st Talk - Timeseries Forecasting as a ServiceAthens Big Data
Title: Timeseries Forecasting as a Service
Speaker: Thanassis Spyrou (https://linkedin.com/in/thanassis-spyrou-92911959/)
Date: Thursday, November 14, 2019
Event: https://meetup.com/Athens-Big-Data/events/265957761/
17th Athens Big Data Meetup - 2nd Talk - Data Flow Building and Calculation P...Athens Big Data
Title: Data Flow Building and Calculation Pipelines via PySpark and ML Modeling via Python
Speaker: Theodoros Michalareas (https://linkedin.com/in/theodorosmichalareas/)
Date: Tuesday, September 24, 2019
Event: https://meetup.com/Athens-Big-Data/events/264702584/
16th Athens Big Data Meetup - 2nd Talk - A Focus on Building and Optimizing M...Athens Big Data
Title: A Focus on Building and Optimizing ML Models with Amazon SageMaker
Speaker: Pavlos Mitsoulis-Ntompos (https://linkedin.com/in/pavlosmitsoulis/)
Date: Thursday, March 14, 2019
Event: https://www.meetup.com/Athens-Big-Data/events/259091496/
16th Athens Big Data Meetup - 1st Talk - An Introduction to Machine Learning ...Athens Big Data
Title: An Introduction to Machine Learning with Python and Scikit-Learn
Speaker: Julien Simon (https://linkedin.com/in/juliensimon/)
Date: Thursday, March 14, 2019
Event: https://www.meetup.com/Athens-Big-Data/events/259091496/
15th Athens Big Data Meetup - 1st Talk - Running Spark On MesosAthens Big Data
Title: Running Spark On Mesos
Speaker: Mr. Chris Sidiropoulos (https://linkedin.com/in/chris-sidiropoulos-2a6b156a//)
Date: Tuesday, November 27, 2018
Event: https://meetup.com/Athens-Big-Data/events/256098657/
5th Athens Big Data Meetup - PipelineIO Workshop - Real-Time Training and Dep...Athens Big Data
Title: Real-Time Training and Deploying Spark ML Recommendations With Kafka and NetflixOSS
Speaker: Chris Fregly (https://linkedin.com/in/cfregly/)
Date: Monday, October 17, 2016
Event: https://meetup.com/Athens-Big-Data/events/234546355/
13th Athens Big Data Meetup - 2nd Talk - Training Neural Networks With Enterp...Athens Big Data
Title: Training Neural Networks With Enterprise Relational Data
Speaker: Dr. Christos Malliopoulos (https://linkedin.com/in/cmalliopoulos//)
Date: Thursday, June 21, 2018
Event: https://meetup.com/Athens-Big-Data/events/251411957/
11th Athens Big Data Meetup - 2nd Talk - Beyond Bitcoin; Blockchain Technolog...Athens Big Data
Title: Beyond Bitcoin; Blockchain Technologies And How They Disrupt Every Industry
Speaker: Mr. Dimitrios Kouzis-Loukas (https://linkedin.com/in/lookfwd//)
Date: Wednesday, December 27, 2017
Event: https://meetup.com/Athens-Big-Data/events/246033156/
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. Also, many times I have seen how developers implement features on the front-end just following the standard rules for a framework and think that this is enough to successfully launch the project, and then the project fails. How to prevent this and what approach to choose? I have launched dozens of complex projects and during the talk we will analyze which approaches have worked for me and which have not.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
3rd Athens Big Data Meetup - 2nd Talk - Neo4j: The World's Leading Graph DB
1. The world's leading graph DB
Georgios Eleftheriadis
Software/Database Engineer
2. What is NOSQL?
It’s not “No to SQL”
It’s not “Never SQL”
It’s “Not Only SQL” as they may support SQL-like query languages
NOSQL describes ongoing trend where developers increasingly opt for non-relational
databases to help solve their problems, in an effort to use the right tool for the right job.
NOSQL example databases
Document Oriented (CouchDB, MongoDB)
Key-Value (Memcached, Redis)
Graph Database (Neo4J, InfiniteGraph)
Multi-model (ArangoDB, OrientDB)
2
3. Graphs are everywhere
Relationships in
Politics, Economics, History, Science, Transportation
Biology, Chemistry, Physics, Sociology
Body, Ecosphere, Reaction, Interactions
Internet
Hardware, Software, Interaction
Social Networks
Family, Friends
Work, Communities
Neighbors, Cities, Society
3
4. A sample social graph
# persons query time
Relational database 1,000 2000ms
Neo4j 1,000 2ms
Neo4j 1,000,000 2ms
with ~1,000 persons
average 50 friends per person
pathExists(a,b) limited to depth 4
caches warmed up to eliminate disk I/O
4
5. A sample social graph
# persons query time
Relational database 1,000 2000ms
Neo4j 1,000 2ms
Neo4j 1,000,000 2ms
with ~1,000 persons
average 50 friends per person
pathExists(a,b) limited to depth 4
caches warmed up to eliminate disk I/O
5
6. A sample social graph
# persons query time
Relational database 1,000 2000ms
Neo4j 1,000 2ms
Neo4j 1,000,000 2ms
with ~1,000 persons
average 50 friends per person
pathExists(a,b) limited to depth 4
caches warmed up to eliminate disk I/O
6
7. SQL VS Cypher
MATCH (keanu:Person { name: 'Keanu Reeves' })-[:ACTED_IN]->(movie:Movie),
(director:Person)-[:DIRECTED]->(movie)
RETURN director.name, count(*)
ORDER BY count(*) DESC
7
SELECT director.name, count(*) FROM person Keanu
JOIN acted_in ON keanu.id = acted_in.person_id
JOIN directed ON acted_in.movie_id = directed.movie_id
JOIN person AS director ON directed.person_id = director.id
WHERE keanu.name = 'Keanu Reeves‘
GROUP BY director.name ORDER BY count(*) DESC
Now let’s find out a bit about the directors in movies that Keanu Reeves acted in. We want to
know how many of those movies each of them directed.
8. Two Ways to Work with Neo4j
Embeddable on JVM
Java
Jruby
Scala
Tomcat
Rails
Server with REST API
every language on the planet
flexible deployment scenarios
DIY server or cloud managed
Embedded capability == Server capability
same scalability, transactionality, and availability
8
9. Pros & Cons with Neo4j Database
Pros
Most of the data is connected
High performance to access connected
data
Least or no impact if changes to the data
model
Can be mixed with JPA with Spring Data
Neo4j
Integrate with Spring
Cons
Could not reuse SQL queries
Migration could be pain
Not effective to use Neo4j if data is not
connected
9
10. Enterprise VS Community Edition
Features Enterprise Community
Property Graph Model YES YES
Native Graph Processing & Storage YES YES
Cypher – Graph Query Language YES YES
Language Drivers YES YES
REST & High-Performance Native API YES YES
Enterprise Lock Manager YES NO
Cache Sharding YES NO
Clustered Replication YES NO
Hot Backups YES NO
Advanced Monitoring YES NO 10
12. Cypher Query Language
Declarative query language
Describe what you want, not how
Based on pattern matching
12
13. CQL MATCH
MATCH (a)-->(b)
RETURN a, b;
MATCH (a)-->()
RETURN a.name;
MATCH (n)-[r]->(m)
RETURN n, r, m;
MATCH (a)-[r]->()
RETURN id(a), labels(a), keys(a), type(r);
MATCH (a)-[r:ACTED_IN]->(m)
RETURN a.name, r.roles, m.title;
13
14. CQL MATCH
MATCH (a)-[:ACTED_IN]->(m)<-[:DIRECTED]-(d)
RETURN a.name, m.title, d.name;
MATCH (a)-[:ACTED_IN]->(m)<-[:DIRECTED]-(d)
RETURN a.name AS actor, m.title AS movie, d.name AS director;
MATCH (a)-[:ACTED_IN]->(m)
OPTIONAL MATCH (d)-[:DIRECTED]->(m)
RETURN a.name, m.title, d.name;
MATCH (a)-[:ACTED_IN]->(m), (d)-[:DIRECTED]->(m)
RETURN a.name, m.title, d.name;
MATCH p=(a)-[:ACTED_IN]->(m)<-[:DIRECTED]-(d)
RETURN nodes(p);
14
15. CQL MATCH Functions
count(x) - add up the number of occurrences
min(x) - get the lowest value
max(x) - get the highest value
avg(x) - get the average of a numeric value
collect(x) - collected all the occurrences into an array
15
16. CQL WHERE
MATCH (tom{name:"Tom Hanks"})-[:ACTED_IN]->(movie)
WHERE movie.released < 1992
RETURN DISTINCT movie.title;
MATCH (actor{name:"Keanu Reeves"})-[r:ACTED_IN]->(movie)
WHERE "Neo" IN r.roles
RETURN DISTINCT movie.title;
MATCH (tom{name:"Tom Hanks"})-[:ACTED_IN]->(movie)<-[:ACTED_IN]-(a)
WHERE a.born < tom.born
RETURN DISTINCT a.name, (tom.born - a.born) AS diff;
MATCH (kevin {name:"Kevin Bacon"})-[:ACTED_IN]->(movie)
RETURN DISTINCT movie.title;
MATCH (kevin)-[:ACTED_IN]->(movie)
WHERE kevin.name =~ '.*Kevin.*‘ // Regular expressions
RETURN DISTINCT movie.title;
16
17. CQL WHERE
MATCH (gene {name:"Gene Hackman"})-[:ACTED_IN]->(movie)<-[:ACTED_IN]-(n)
WHERE (n)-[:DIRECTED]->()
RETURN DISTINCT n.name;
MATCH (a)-[:ACTED_IN]->()
RETURN a.name, count(*) AS count
ORDER BY count DESC LIMIT 5;
MATCH (keanu {name:"Keanu Reeves"})-[:ACTED_IN]->()<-[:ACTED_IN]-(c),
(c)-[:ACTED_IN]->()<-[:ACTED_IN]-(coc)
WHERE NOT((keanu)-[:ACTED_IN]->()<-[:ACTED_IN]-(coc)) AND coc <> keanu
RETURN coc.name, count(coc)
ORDER BY count(coc) DESC LIMIT 3;
// Recommend 5 actors that Keanu Reeves should work with (but hasn't)
17
18. CQL CREATE
CREATE ({title:"Mystic River", released:1993});
MATCH (movie {title:"Mystic River"})
SET movie.tagline = "We bury our sins here, Dave. We wash them clean."
RETURN movie;
MATCH (kevin {name:"Kevin Bacon"}),(movie {title:"Mystic River"})
CREATE UNIQUE (kevin)-[:ACTED_IN {roles:["Sean"]}]->(movie);
MATCH (kevin {name:"Kevin Bacon"})-[r:ACTED_IN]->(movie {title:"Mystic River"})
SET r.roles = ["Sean Devine"]
RETURN r.roles;
MATCH (clint {name:"Clint Eastwood"})-[r:ACTED_IN]->(movie {title:"Mystic River"})
CREATE UNIQUE (clint)-[:DIRECTED]->(movie);
18
19. CQL DELETE
MATCH (matrix {title:"The Matrix"})<-[r:ACTED_IN]-(a)
WHERE "Emil" IN r.roles
RETURN a;
// Emil Eifrem is CEO of Neo Technology and co-founder of the Neo4j project
MATCH (emil{name:"Emil Eifrem"})
DELETE emil;
MATCH (emil{name:"Emil Eifrem"}) -[r]-()
DELETE r;
MATCH (emil{name:"Emil Eifrem"}) -[r]-()
DELETE r, emil;
MATCH (node) where ID(node)=1
OPTIONAL MATCH (node)-[r]-()
DELETE r, node;
19
20. CQL INDEXES
There is usually no need to specify which indexes to use in a query, Cypher will figure that out by itself. Indexes
are also automatically used for equality comparisons and inequality (range) comparisons of an indexed
property in the WHERE clause. USING is used to influence the decisions of the planner when building an
execution plan for a query.
CREATE INDEX ON :Person(name)
MATCH (person:Person { name: 'Keanu Reeves' })
RETURN person
MATCH (person:Person)
WHERE person.name = 'Keanu Reeves'
RETURN person
MATCH (person:Person)
WHERE person.name > 'Keanu'
RETURN person
MATCH (person:Person)
WHERE person.name STARTS WITH 'Kea' // CONTAINS, ENDS WITH
RETURN person
20
23. CQL Do-It-Yourself
Add KNOWS relationships between all actors who were in the same movie
23
MATCH (a)-[:ACTED_IN]->()<-[:ACTED_IN]-(b)
CREATE UNIQUE (a)-[:KNOWS]->(b);
MATCH (a)-[:ACTED_IN|DIRECTED]->()<-[:ACTED_IN|DIRECTED]-(b)
CREATE UNIQUE (a)-[:KNOWS]->(b);
24. CQL Useful Tricks
Find friends of friends
MATCH (keanu{name:"Keanu Reeves"})-[:KNOWS*2]->(fof)
WHERE NOT((keanu)-[:KNOWS]-(fof))
RETURN DISTINCT fof.name;
Find shortest path
MATCH p=shortestPath(
(charlize{name:"Charlize Theron"})-[:KNOWS*]->(bacon{name:"Kevin Bacon"}))
RETURN length(rels(p));
Return the names of the people joining Charlize to Kevin.
MATCH p=shortestPath(
(charlize{name:"Charlize Theron"})-[:KNOWS*]->(bacon{name:"Kevin Bacon"}))
RETURN extract(n IN nodes(p)| n.name) AS names
24
25. CQL Useful Tricks
Find movies and actors up to 4 "hops" away from Kevin Bacon
MATCH (bacon:Person {name:"Kevin Bacon"})-[*1..4]-(hollywood)
RETURN DISTINCT Hollywood
Find someone to introduce Tom Hanks to Tom Cruise
MATCH (tom:Person {name:"Tom Hanks"})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors),
(coActors)-[:ACTED_IN]->(m2)<-[:ACTED_IN]-(cruise:Person {name:"Tom Cruise"})
RETURN tom, m, coActors, m2, cruise
25
26. Social Network Database
Relationship Type Properties
IS_FRIEND since
LIKES
WROTE_COMMENT
HAS_COMMENT
UPLOADED
SENT_MESSAGE
RECEIVED_MESSAGE
TAGGED_IN
Node Labels Properties
Person name, email, dob
Photo caption, date
Status text, date
Comment text, date
Message Text, date
26