The slides walk through the requirements and help you pick the NoSQL database that best matches them, i.e. fitness for purpose.
In this session, we will discuss DataGrid usage and integration patterns that go beyond the typical Cache#put and Cache#get. Attendees will share their own experiences and we will discuss:
• How a DataGrid can be used with an ESB to provide a reliable, fault-tolerant SEDA implementation (based on Mule);
• How a DataGrid can be used to execute dynamic jobs, à la MapReduce, using scripting languages;
• How a DataGrid can be used to scale Lucene index storage, as well as how Compass can be used to index the DataGrid.
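The MapReduce-style pattern in the second bullet can be sketched in plain Python. This is an illustration of the general pattern only, not the Mule or DataGrid API; the function names are made up:

```python
from collections import defaultdict

def map_phase(documents, map_fn):
    """Apply the user-supplied map function to every document,
    collecting intermediate (key, value) pairs by key."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return groups

def reduce_phase(groups, reduce_fn):
    """Apply the user-supplied reduce function to each key group."""
    return {key: reduce_fn(key, values) for key, values in groups.items()}

# A "dynamic job": map and reduce functions supplied at runtime,
# e.g. as scripts pushed out to the grid nodes.
documents = ["the quick brown fox", "the lazy dog"]
word_map = lambda doc: [(w, 1) for w in doc.split()]
word_reduce = lambda key, values: sum(values)

result = reduce_phase(map_phase(documents, word_map), word_reduce)
# result["the"] == 2
```

In a real DataGrid deployment the map phase would run on the nodes that hold the data, so only the small per-key aggregates travel over the network.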
HBaseCon 2015: Industrial Internet Case Study using HBase and TSDB (HBaseCon)
This case study involves analysis of high-volume, continuous time-series aviation data from jet engines, consisting of temperature, pressure, vibration, and related parameters from the on-board sensors, joined with well-characterized, slowly changing engine asset configuration data and other enterprise data for continuous engine diagnostics and analytics. This data is ingested via a distributed fabric comprising transient containers, message queues, and columnar, compressed storage leveraging OpenTSDB over Apache HBase.
When it comes to data security, Uber’s business has unique needs related to scale, use cases, and technical stacks. This talk will discuss how our data platform team addressed specific challenges in deploying Uber's security requirements for Apache Hadoop, including how we leveraged open-source building blocks. We'll share insights on how we augmented our Kerberized Hadoop integration with additional authentication mechanisms, as well as our approach to supporting custom authentication in Apache Knox. In particular, we will elaborate on Uber’s contributions to Apache Knox, specifically a novel pluggable platform for custom validation of any user request. This talk will also cover how we address table-, column-, and partition-level access control while improving developer productivity. We will explain how we translate RBAC policies into HDFS ACLs to control data access, describe our internal audit platform built to detect and analyze common security infringements, and share real-world examples from our experiences in production.
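The RBAC-to-HDFS-ACL translation mentioned above can be illustrated with a small sketch. The policy format, role names, and action-to-permission mapping below are invented assumptions, not Uber's actual implementation; only the ACL spec syntax accepted by `hdfs dfs -setfacl` is standard HDFS:

```python
# Hypothetical RBAC policy: role -> owning group and allowed actions per path.
RBAC_POLICY = {
    "analyst": {"group": "analysts", "paths": {"/data/trips": {"read"}}},
    "etl":     {"group": "etl",      "paths": {"/data/trips": {"read", "write"}}},
}

def to_hdfs_acl_specs(policy):
    """Translate each role's path permissions into HDFS ACL spec strings,
    as accepted by `hdfs dfs -setfacl -m <spec> <path>`.
    Read access also needs the execute bit to traverse directories."""
    specs = []
    for role, entry in policy.items():
        for path, actions in entry["paths"].items():
            bits = "rwx" if "write" in actions else "r-x"
            specs.append((path, f"group:{entry['group']}:{bits}"))
    return sorted(specs)

for path, spec in to_hdfs_acl_specs(RBAC_POLICY):
    print(f"hdfs dfs -setfacl -m {spec} {path}")
```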
Speakers
Mohammad Islam, Staff Software Engineer, Uber
Wei Han, Manager, Uber
From the 2017 HPCC Systems Community Day:
Gavin Halliday shares the latest features and functionality coming in the next release.
Gavin Halliday
Enterprise Architect, LexisNexis
Gavin is a Lead Enterprise Architect for LexisNexis Risk Solutions and one of the original developers for the HPCC Systems platform, having responsibility for the ECL Compiler.
Gavin holds a Physics degree from the University of Oxford and resides in the UK.
The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku... (Databricks)
AI is fundamentally transforming how we live and work.
Zalando is a data-driven company. We deliver an optimal customer experience that drives engagement. We continue to improve this experience by leveraging the latest technologies and machine learning techniques, such as building a cutting-edge, cloud-based infrastructure to support our operations at scale.
We provide our data scientists across Zalando with the means to implement artificial intelligence use cases, leveraging data from all parts of our company and the best machine learning techniques from across the industry. Apache Spark delivered through Databricks is at the core of this strategy.
In this keynote, I’ll share our AI journey thus far, and how we are exploring ways to unify data through AI with Spark and Databricks.
As HBase and Hadoop continue to become routine across enterprises, these enterprises inevitably shift priorities from effective deployments to cost-efficient operations. Consolidation of infrastructure, the sum of hardware, software, and system-administrator effort, is the most common strategy to reduce costs. As a company grows, the number of business organizations, development teams, and individuals accessing HBase grows commensurately, creating a not-so-simple requirement: HBase must effectively service many users, each with a variety of use cases. This problem is known as multi-tenancy. While multi-tenancy isn’t a new problem, it also isn’t a solved one, in HBase or otherwise. This talk will present a high-level view of the common issues organizations face when multiple users and teams share a single HBase instance, and how certain HBase features were designed specifically to mitigate the issues created by the sharing of finite resources.
Overview of big data technologies like Hadoop, Hive, Pig, HDFS, Map Reduce, Spark and example architectures for designing big data products and platforms.
Data Science Languages and Industry Analytics (Wes McKinney)
September 19, 2015 talk at Berkeley Institute for Data Science. On how comparatively poor JSON / structured data tools pose a challenge for the data science languages (Python, R, Julia, etc.).
Move your on-prem data to a Lake in the Cloud (CAMMS)
With the boom in data, both its volume and its complexity, the trend is to move data to the cloud. Where and how do we do this? Azure gives you the answer. In this session, I will give you an introduction to Azure Data Lake and Azure Data Factory, and explain why they are a good fit for this type of problem. You will learn how large datasets can be stored in the cloud, and how you can transport your data to this store. The session will also briefly cover Azure Data Lake as the modern warehouse for data in the cloud.
The session covers how to get started building big data solutions in Azure. Azure provides different cluster types for the Hadoop ecosystem. The session covers the basics of HDInsight clusters, including Apache Hadoop, HBase, Storm, and Spark, and how to integrate with HDInsight from .NET using different Hadoop integration frameworks and libraries. It is a jump start for engineers and DBAs with RDBMS experience who want to start working with and developing Hadoop solutions. The session is demo-driven and covers the basics of Hadoop open-source products.
HBase from the Trenches - Phoenix Data Conference 2015 (Avinash Ramineni)
Apache HBase has been widely adopted at many enterprises. In this talk we will cover a few war stories about troubleshooting, tuning, and fixing problems with HBase clusters. We will also cover some of the best practices, tools, utilities, and lessons learned from evaluating deployments at different organizations.
SQL on Hadoop
Looking for the correct tool for your SQL-on-Hadoop use case?
There is a long list of alternatives to choose from; how do you select the correct tool?
The tool selection is always based on use case requirements.
Read more on alternatives and our recommendations.
A talk given by Ted Dunning on February 2013 on Apache Drill, an open-source community-driven project to provide easy, dependable, fast and flexible ad hoc query capabilities.
Innovation with Connection, The new HPCC Systems Plugins and Modules (HPCC Systems)
As part of the 2018 HPCC Systems Summit Community Day event:
The HPCC Systems platform team continues to expand interoperability with third party systems, which increases the platform feature-set and facilitates custom solutions. James will share an update on the latest connectors available, including the Spark-HPCC, and the upcoming HDFS connector plugin.
James McMullan has a broad range of software engineering experience, from developing low-level system drivers for X-ray fluorescence equipment to mobile video games and web applications. He is a recent addition to the LexisNexis team and is part of the HPCC Systems platform team, where he has been working on connectors integrating HPCC Systems with the Spark and Hadoop ecosystems.
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...) (Lucas Jellema)
This presentation gives a brief overview of the history of relational databases, ACID, and SQL, and presents some of their key strengths and potential weaknesses. It introduces the rise of NoSQL: why it arose, what it entails, and when to use it. The presentation focuses on MongoDB as a prime example of a NoSQL document store and shows how to interact with MongoDB from JavaScript (Node.js) and Java.
Development of Apache Hive was initiated at Facebook in 2007 in response to its data growth.
Facebook's existing ETL system began to fail over the following few years as more people joined Facebook.
In August 2008, Facebook decided to move to a more scalable open-source Hadoop-based environment: Hive.
Facebook, Netflix, and Amazon now support Hive's SQL dialect, known as HiveQL.
Azure Cosmos DB: Features, Practical Use and Optimization (GlobalLogic Ukraine)
This presentation is dedicated to Azure Cosmos DB: its history, characteristics, tasks, and solutions. It deals with performance optimization, practical experience of usage, and an overview of the Cosmos DB news from the Microsoft Build 2017 conference (https://build.microsoft.com).
This presentation by Andriy Gorda (Engineering Manager & Lead Software Engineer, Consultant, GlobalLogic Kharkiv) was delivered at GlobalLogic Kharkiv MS TechTalk on June 13, 2017.
Engineering patterns for implementing data science models on big data platforms (Hisham Arafat)
A discussion of practically implementing data science models on big data platforms from an engineering perspective, and an eye-opener on the engineering factors involved in designing a working solution. We use a simple text-mining example: social media analytics for brand marketing. At first it seems a simple solution; however, if you think deeply about the implementation aspects of even a simple analytics model, you discover the degree of complexity in each part of the solution. An abstraction of the key Big Data advantages is very helpful for selecting appropriate Big Data technology components out of a very large landscape. Two referenced examples are given: using the Lambda Architecture, and an unusual way of doing image processing using the Big Data abstraction provided.
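As a rough illustration of the text-mining example mentioned in the abstract, here is a toy brand-mention counter. The brand name, word lists, and scoring are invented for illustration; a real pipeline would shard this step across the cluster:

```python
import re
from collections import Counter

# Crude sentiment word lists (illustrative only).
POSITIVE = {"love", "great", "awesome"}
NEGATIVE = {"hate", "broken", "awful"}

def score_posts(posts, brand):
    """Count posts mentioning the brand and tally crude sentiment
    from the positive/negative word lists."""
    mentions, sentiment = 0, Counter()
    for post in posts:
        words = re.findall(r"[a-z']+", post.lower())
        if brand.lower() in words:
            mentions += 1
            sentiment["pos"] += sum(w in POSITIVE for w in words)
            sentiment["neg"] += sum(w in NEGATIVE for w in words)
    return mentions, dict(sentiment)

posts = ["I love my Acme phone", "Acme support is awful", "nice weather"]
# score_posts(posts, "Acme") -> (2, {"pos": 1, "neg": 1})
```

Even this toy version hints at the hidden complexity the abstract describes: tokenization, language handling, and sentiment lexicons all become hard engineering problems at scale.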
Summary of recent progress on Apache Drill, an open-source community-driven project to provide easy, dependable, fast and flexible ad hoc query capabilities.
An RDX Insights Series Presentation that analyzes the most significant areas of database vendor competition. Competitive evaluations include public vs private cloud, the three leading public cloud offerings, NoSQL vs relational, open source vs commercial and the traditional DBMS vendors vs all competitors.
Vitaliy Bondarenko, "Fast Data Platform for Real-Time Analytics. Architecture ..." (Fwdays)
We will start by understanding how real-time analytics can be implemented on enterprise-level infrastructure, then go into the details and discover how different business intelligence cases can be served in real time on streaming data. We will cover different stream data processing architectures and discuss their benefits and disadvantages. I'll show with live demos how to build a Fast Data Platform in the Azure cloud using open-source projects: Apache Kafka, Apache Cassandra, and Mesos. I'll also show examples and code from real projects.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... (BookNet Canada)
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPathCommunity)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti... (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
UiPath Test Automation using UiPath Test Suite series, part 4 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimizing testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
The Art of the Pitch: WordPress Relationships and Sales (Laura Byrne)
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
UiPath Test Automation using UiPath Test Suite series, part 3 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation introduction
UI automation sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
State of ICS and IoT Cyber Threat Landscape Report 2024 preview (Prayukth K V)
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities, spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on countries – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Kubernetes & AI - Beauty and the Beast!?! @KCD Istanbul 2024 (Tobias Schneck)
As AI technology pushes into IT, I found myself wondering, as an “infrastructure container Kubernetes guy”, how does this fancy AI technology get managed from an infrastructure operations view? Is it possible to apply our lovely cloud-native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and provide you with a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need to apply it to our own infrastructure and get it to work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could be beneficial or limiting for your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I already have working for real.
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
3. Top Considerations For NoSQL Databases
• Data Model
• Query Model
• Consistency Model
• APIs
• Scalability
• HA/DR
• Operational Cost
• Commercial Support and Community Strength
4. Data Model
• Document Model
– MongoDB and CouchDB
• Key-Value Model
– Riak and Redis
• Column Model
– HBase and Cassandra (Wide Column)
• Graph Model
– Neo4j and HyperGraphDB
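To make the four models concrete, here is the same small social-graph record sketched in each style using plain Python literals. The shapes are illustrative only, not the actual on-disk formats of the products named:

```python
# The same "user follows user" fact modeled four ways.

# Document model (MongoDB/CouchDB style): nested, self-contained documents.
document = {"_id": "u1", "name": "Ada", "follows": [{"id": "u2", "name": "Lin"}]}

# Key-value model (Riak/Redis style): an opaque value behind a single key.
key_value = {"user:u1": '{"name": "Ada", "follows": ["u2"]}'}

# Wide-column model (HBase/Cassandra style):
# row key -> column family -> column qualifier -> value.
wide_column = {"u1": {"info": {"name": "Ada"},
                      "follows": {"u2": "2015-01-01"}}}

# Graph model (Neo4j style): explicit nodes and typed edges,
# so relationships are first-class and cheap to traverse.
nodes = {"u1": {"name": "Ada"}, "u2": {"name": "Lin"}}
edges = [("u1", "FOLLOWS", "u2")]
```

Notice the trade-off: the document and key-value forms keep everything for one user together, while the graph form makes "who follows whom" queries natural in both directions.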
5. Query Model
• Document Databases
– Indexing options: compound indexes, sparse indexes, time-to-live (TTL) indexes
– Query options: RegEx, GT, LT, EQ
• Key-Value Stores
– Indexing options: secondary indexes
• Column Databases
– Indexing options: secondary indexes
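The indexing options above can be illustrated with a toy document store that maintains a secondary index for equality (EQ) queries and a TTL index for expiry. This is a sketch of the concepts only, not any product's implementation:

```python
import time

class ToyStore:
    """In-memory document store with a secondary index and a TTL index."""

    def __init__(self, ttl_seconds=None):
        self.docs = {}       # primary key -> document
        self.by_city = {}    # secondary index: city -> set of primary keys
        self.expiry = {}     # TTL index: primary key -> expiry timestamp
        self.ttl = ttl_seconds

    def put(self, key, doc):
        self.docs[key] = doc
        self.by_city.setdefault(doc["city"], set()).add(key)
        if self.ttl is not None:
            self.expiry[key] = time.time() + self.ttl

    def find_by_city(self, city):
        """Answer an equality (EQ) query via the secondary index,
        avoiding a scan of every document."""
        return [self.docs[k] for k in self.by_city.get(city, ())]

    def purge_expired(self, now=None):
        """Background job a real store would run: drop expired documents."""
        now = time.time() if now is None else now
        for k, t in list(self.expiry.items()):
            if t <= now:
                doc = self.docs.pop(k)
                self.by_city[doc["city"]].discard(k)
                del self.expiry[k]

store = ToyStore(ttl_seconds=60)
store.put("u1", {"name": "Ada", "city": "Oxford"})
```

A compound index would simply key the secondary map on a tuple of fields, and a sparse index would skip documents that lack the indexed field.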
7. APIs
• Maturity of the API
– time and cost required to develop and maintain the system
– easier to learn and use
– reduces the onboarding time
– provides direct interfaces to put and get documents, or fields within documents
• Language support
• RESTful APIs
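The "put and get documents or fields" point can be sketched as a minimal document API. The REST route shape and the class below are invented for illustration and match no specific product:

```python
def document_url(base, collection, doc_id, field=None):
    """Build a REST-style path for a whole document or a single field."""
    url = f"{base}/{collection}/{doc_id}"
    return f"{url}/{field}" if field is not None else url

class ToyDocumentAPI:
    """In-memory stand-in for the server behind those routes."""

    def __init__(self):
        self.collections = {}

    def put(self, collection, doc_id, doc):
        self.collections.setdefault(collection, {})[doc_id] = doc

    def get(self, collection, doc_id, field=None):
        doc = self.collections[collection][doc_id]
        return doc if field is None else doc[field]

api = ToyDocumentAPI()
api.put("users", "u1", {"name": "Ada", "city": "Oxford"})
# document_url("/db", "users", "u1", "name") -> "/db/users/u1/name"
```

The slide's point about maturity is visible even here: exposing fields as sub-resources means clients can fetch one attribute without downloading and parsing the whole document.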
6. Consistency Model
• NoSQL stores maintain multiple copies of the data for availability and scalability purposes; consistency of the data across copies is often eventual.
• Document data stores are consistent; key-value and wide-column stores are typically eventually consistent.
• Eventually consistent stores need to handle conflicting updates (write conflicts):
– Riak uses vector clocks to determine the ordering of events.
– Cassandra assumes the greatest timestamp is correct, hence writes tend to perform well but updates are a trade-off.
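The two conflict-handling strategies named above can be sketched as follows (simplified; the real Riak and Cassandra implementations differ in detail):

```python
def lww_merge(a, b):
    """Cassandra-style last-write-wins: the write with the greatest
    timestamp is assumed correct, so the other write is silently lost."""
    return a if a["ts"] >= b["ts"] else b

def vclock_compare(va, vb):
    """Riak-style vector clocks. va/vb map node name -> logical counter.
    Returns 'a' or 'b' if one version descends from the other,
    'equal' if identical, or 'conflict' for concurrent writes
    (which a client must then resolve, e.g. by merging siblings)."""
    keys = set(va) | set(vb)
    a_ge = all(va.get(k, 0) >= vb.get(k, 0) for k in keys)
    b_ge = all(vb.get(k, 0) >= va.get(k, 0) for k in keys)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a"
    if b_ge:
        return "b"
    return "conflict"

# Last-write-wins picks the later timestamp:
#   lww_merge({"v": "x", "ts": 5}, {"v": "y", "ts": 9})["v"] == "y"
# Vector clocks detect concurrent updates instead of dropping one:
#   vclock_compare({"n1": 2, "n2": 1}, {"n1": 1, "n2": 2}) == "conflict"
```

This is the trade-off the slide notes: last-write-wins keeps writes cheap but can drop updates, while vector clocks preserve both versions at the cost of pushing conflict resolution to the reader.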