Interpreting New Trends in Cloud Big Data (解讀雲端大數據新趨勢)
2018-05-16 @ iThome Cloud Summit 2018
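Cloud computing, big data, the Internet of Things, and artificial intelligence: these hot topics have appeared in the media one after another since 2008. Looking back over the past ten years of Apache Hadoop adoption in Taiwan, this talk explains how the four topics relate to one another, examines the market demand driving the Big Data Stack on the Cloud, and closes with the progress of the Big Data Stack on Kubernetes.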
1. My Journey of “Innovation” (aka “From Zero to One”)
Interpreting New Trends in Cloud Big Data (解讀雲端大數據新趨勢)
Big Data Stack on The Cloud
Jazz Yao-Tsung Wang
Initiator and Chair, TDEA
Data Architect, TenMax
Shared at 2018-05-16 < iThome Cloud Summit 2018 >
2. Hello!
I am Jazz Wang
Co-Founder of Hadoop.TW
Initiator and Chair of Taiwan Data Engineering Association (TDEA)
Hadoop Evangelist since 2008.
Open Source Promoter. System Admin (Ops).
- 11 years (2002/08 ~ 2014/02): Associate Researcher in the HPC field.
- 2 years (2014/03 ~ 2016/04): Assistant Vice President (AVP), Product Management of ‘Big Data Platform Management’.
- 2 years (2016/04 ~ Now): Data Architect of Real-Time Bidding.
You can find me at @jazzwang_tw or
https://fb.com/groups/dataengineering.tw
https://slideshare.net/jazzwang
33. K8S Big Data SIG
▷ Big Data SIG
Covers deploying and operating big data applications (Spark, Kafka, Hadoop, Flink, Storm, etc.) on Kubernetes. We focus on integrations with big data applications and architecting the best ways to run them on Kubernetes.
○ Design and architect ways to run big data applications effectively on Kubernetes
○ Discuss ongoing implementation efforts
○ Discuss resource sharing and multi-tenancy (in the context of big data applications)
○ Suggest Kubernetes features where we see a need
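As a rough illustration of what "running a big data application on Kubernetes" can look like in practice, the sketch below assembles a `spark-submit` command line for Spark's native Kubernetes mode. The slide itself shows no commands, so this is an assumption about one possible workflow rather than the SIG's recommended method. The API server address, container image, jar path, and the `team-a` namespace are hypothetical placeholders; the per-tenant namespace is only one simple way to approach the multi-tenancy item above.

```python
from typing import List


def build_spark_on_k8s_command(
    api_server: str,
    image: str,
    app_jar: str,
    main_class: str,
    namespace: str = "default",
    executors: int = 2,
) -> List[str]:
    """Assemble a spark-submit argv that targets a Kubernetes cluster.

    All argument values are supplied by the caller; nothing here contacts
    a cluster, the function only builds the command line.
    """
    if executors < 1:
        raise ValueError("executors must be at least 1")
    return [
        "spark-submit",
        # The k8s:// prefix tells Spark to schedule on Kubernetes.
        "--master", f"k8s://{api_server}",
        "--deploy-mode", "cluster",
        "--name", "bigdata-on-k8s-demo",  # hypothetical application name
        "--class", main_class,
        # One namespace per team is a simple (assumed) tenancy boundary.
        "--conf", f"spark.kubernetes.namespace={namespace}",
        "--conf", f"spark.executor.instances={executors}",
        "--conf", f"spark.kubernetes.container.image={image}",
        app_jar,
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    cmd = build_spark_on_k8s_command(
        api_server="https://k8s.example.invalid:6443",
        image="example.invalid/spark:demo",
        app_jar="local:///opt/spark/examples/jars/spark-examples.jar",
        main_class="org.apache.spark.examples.SparkPi",
        namespace="team-a",
    )
    print(" ".join(cmd))
```

Building the argument list separately from executing it keeps the sketch runnable without a cluster; the list could be passed to `subprocess.run` on a machine where Spark and cluster credentials are available.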