Hitachi streaming data platform v8
1. Our societal infrastructure has been transformed by the digitization of our industries. The advent of social media, cloud and virtualization technologies is leading to exponential growth of data on a scale that we have never seen before.
As a result, the amount of data handled by data processing systems continues to grow daily. The ability to quickly summarize and analyze this data can provide us with valuable new insights. To be useful, any real-time data processing system must be able to create new value from the massive amounts of data being created every second.
Why Streaming Data Analytics?
Businesses need to adapt to the digital era, which requires:
1) Real-time Analytics - Taking action at the right time and place is increasingly becoming an imperative for businesses.
2) Perishable Insights - Insights must be captured before they are lost.
3) Temporal Insights - Time is a key dimension in analysis. Using time in analysis and correlating multiple streams of data across time windows are key to making decisions.
4) Data is born ‘Distributed’ - It is more efficient and effective to analyze data and gain insights as the data is generated than to collect it at a single central location and then perform the analysis.
The Hitachi Streaming Data Platform (HSDP) responds to this challenge by giving you the ability to perform stream data processing in real time and at scale in a distributed architecture.
Deliver more timely insights from your ‘Big Data’ with a real-time analytics solution that enables real-time control, automates decisions and transitions your business into the ‘Digital age’.
Hitachi’s Streaming Data Platform is a real-time streaming software solution that is mature, highly scalable and easily adaptable to cross-industry real-time use cases. It integrates with open source technologies and leverages your existing investments.
DATASHEET
Hitachi’s Streaming Data Platform - ‘Delivering Real-time Control for Digital Transformation’.
Adding HSDP to your data processing system gives you a tool that is designed for processing these large volumes of data from various data sources. This engine is highly scalable, adaptable to any industry application, and can be integrated with open source technologies to deliver analytics applications in an agile environment.
Hitachi Streaming Data Platform
Features:
HSDP uses both in-memory processing and incremental computational processing, which allows it to quickly process large sets of time-sequenced data.
1) High-speed processing of large sets of time-sequenced data includes:
• In-memory processing
With in-memory processing, data is processed while it is still in memory, thus eliminating unnecessary disk access. When processing large data sets, the time required to perform disk I/O can be significant. By processing data while it is still in memory, Streaming Data Platform avoids excess disk I/O, enabling data to be processed faster.
• Incremental computational processing
With incremental computational processing, a pre-loaded query is processed iteratively when triggered by the input data, and the processing results are available for the next iteration. This means that the next set of computations does not need to process all of the target data elements; only those elements that have changed need to be processed.
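The incremental idea described above can be illustrated with a short sketch. The Python class below is a hypothetical illustration, not HSDP code: the class name, the row-based window and the per-key sum are all assumptions chosen for the example. It keeps a running sum per key and updates it using only the tuple that enters the window and the tuple that leaves it, instead of re-summing the whole window on every arrival.

```python
from collections import defaultdict, deque


class IncrementalWindowSum:
    """Running per-key sum over the last `size` tuples of each key (illustrative only)."""

    def __init__(self, size):
        self.size = size
        self.windows = defaultdict(deque)  # key -> values currently inside the window
        self.sums = defaultdict(int)       # key -> running sum of that window

    def update(self, key, value):
        window = self.windows[key]
        window.append(value)
        self.sums[key] += value            # account for the new element only
        if len(window) > self.size:
            # Subtract the single element that just expired from the window.
            self.sums[key] -= window.popleft()
        return self.sums[key]


if __name__ == "__main__":
    agg = IncrementalWindowSum(size=3)
    for key, value in [("a", 1), ("b", 2), ("a", 4), ("b", 6), ("a", 9), ("a", 3)]:
        print(key, agg.update(key, value))
```

Each call to `update` does a constant amount of work regardless of the window size, which is the property the incremental approach relies on.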
Hitachi’s Streaming Data Platform - A real-time data processing engine that analyzes the “right now”
2) Processing and analysis are simplified using CQL, a widely used language similar to SQL
CQL (Continuous Query Language) is an extension of traditional SQL. CQL executes in memory and is designed for high-throughput, low-latency environments. It has a ‘windowing’ concept that allows the system to treat each stream, packet and flow individually and allows for ‘stateful’ analysis, unlike open source technologies where this capability has to be custom coded.
Hitachi CQL provides the capability to centrally develop and globally deploy applications.
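To make the ‘windowing’ and ‘stateful’ ideas concrete, the sketch below shows in Python what a windowed continuous query has to maintain behind the scenes. It is not CQL syntax and not the HSDP API; the class name, the `on_tuple` method and the flow identifiers are assumptions made for illustration. It keeps separate state per flow and, on every arriving tuple, reports how many tuples of that flow fall inside a sliding time range. Timestamps are assumed to be non-decreasing within each flow.

```python
from collections import defaultdict, deque


class SlidingTimeWindowCount:
    """Counts tuples per key whose timestamp lies within the last `range_seconds`."""

    def __init__(self, range_seconds):
        self.range_seconds = range_seconds
        self.state = defaultdict(deque)  # key -> timestamps still inside the window

    def on_tuple(self, key, timestamp):
        timestamps = self.state[key]
        timestamps.append(timestamp)
        # Evict timestamps that have fallen out of the window (timestamp - range, timestamp].
        while timestamps and timestamps[0] <= timestamp - self.range_seconds:
            timestamps.popleft()
        return key, len(timestamps)


if __name__ == "__main__":
    query = SlidingTimeWindowCount(range_seconds=10)
    for flow, ts in [("flow-1", 0), ("flow-1", 4), ("flow-2", 5), ("flow-1", 11)]:
        print(query.on_tuple(flow, ts))
```

A windowed query language expresses the same behavior declaratively, so the eviction and per-key bookkeeping shown here do not have to be hand-coded for each application.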
3) Temporal Analysis
Time is an important dimension for streaming analytics. HSDP natively supports temporal analysis of data. Data tuples can be analyzed based on either their arrival time or their creation time.
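The choice between arrival time and creation time matters because the same tuples can land in different windows. The following Python sketch is an illustration under assumed names (`StreamTuple`, `count_per_window` and the field names are hypothetical, not part of HSDP). It counts tuples per fixed-width window using whichever timestamp is selected.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamTuple:
    sensor: str
    created_at: int  # seconds, stamped by the source
    arrived_at: int  # seconds, stamped when the tuple is received


def count_per_window(tuples, width, time_field):
    """Count tuples per window of `width` seconds, keyed by window start time."""
    counts = Counter()
    for t in tuples:
        ts = getattr(t, time_field)
        counts[(ts // width) * width] += 1
    return dict(counts)


if __name__ == "__main__":
    data = [
        StreamTuple("s1", created_at=8, arrived_at=9),
        StreamTuple("s1", created_at=9, arrived_at=12),   # delayed in transit
        StreamTuple("s2", created_at=14, arrived_at=15),
    ]
    print(count_per_window(data, width=10, time_field="created_at"))
    print(count_per_window(data, width=10, time_field="arrived_at"))
```

In this sample the delayed tuple belongs to the first window by creation time but to the second window by arrival time, which is why the time basis needs to be chosen per analysis.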
4) Developing Custom Processing and Analytics Using Extensions in C or Java
HSDP provides APIs for plugging in custom extensions, through which developers can integrate machine learning algorithms developed in R or Python. The API can be implemented in either C or Java and is invoked by CQL on the data stream.
5) Scale Up or Scale Out
In a scale-up scenario, query groups can run their analysis in parallel on multiple threads within the same HSDP instance. In a scale-out scenario, query groups can run their analysis in parallel on multiple HSDP instances.
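As a rough illustration of the scale-up case, the Python sketch below models each query group as a plain function and runs the groups on separate threads over the same input. This is an assumption made for the example, not how HSDP defines or schedules query groups, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def total_by_key(tuples):
    """Query group 1: sum of values per key."""
    totals = {}
    for key, value in tuples:
        totals[key] = totals.get(key, 0) + value
    return totals


def max_by_key(tuples):
    """Query group 2: largest value per key."""
    peaks = {}
    for key, value in tuples:
        peaks[key] = max(value, peaks.get(key, value))
    return peaks


def run_query_groups(groups, tuples, workers=2):
    """Run each named query group on its own thread and collect the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(fn, tuples) for name, fn in groups.items()}
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    stream = [("a", 1), ("b", 2), ("a", 4)]
    print(run_query_groups({"totals": total_by_key, "peaks": max_by_key}, stream))
```

The scale-out case has the same shape, with the worker threads replaced by separate instances on separate hosts.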
6) Virtualization
HSDP software can run on a virtual machine and even in a Docker container.
7) Multi-stage Geo-Distributed Architecture
HSDP is designed for solutions that are deployed in a geo-distributed environment, which is crucial for IoT applications where insights and actions are desired both at the edges and at the core.
8) SDK
HSDP provides an SDK for ingesting data from a variety of sources and publishing insights to a variety of targets. The SDK enables HSDP to blend in with your custom environment and existing investments.
The Hitachi Streaming Data Platform solution component provides a big data engine and an application framework for defining and customizing specific application requirements.
[Figure: Stream Data Platform overview. Massive real-world information (operation information, communication data, IC card data, network input) is analyzed at the moment it arrives, and the analysis results are drawn or sent as notifications (dashboard, result file). Panels: 1. Analysis processing of time-series data (sliding windows); 2. Data analysis scenarios in CQL; 3. In-memory incremental calculation.]
1. Sliding windows enable analysis of unbounded time-series data streams.
2. Continuous Query Language (CQL) offers high productivity.
3. In-memory incremental calculation realizes high performance.
3. HSDP’s Fit in Geo-Distributed Analytics
Architecture
[Figure: Geo-distributed analytics architecture. Heterogeneous geo-distributed sources (sensor, device, machine, HVAC, network, set-top box, SCADA) each run HSDP-L. Edge data centers run HSDP-E and produce edge insights. The central data center (on-premises or public cloud) runs HSDP-E alongside machine learning, historical aggregates and NoSQL / RDB / HDFS storage, and produces aggregated insights. Heterogeneous targets include a dashboard, alerts and industry-specific analysis scenarios.
Legend: HSDP-L = HSDP Lite or HSDP Client; HSDP-E = HSDP Enterprise.]
Data is born distributed. This implies that the analysis will need to be distributed to enable business systems to “act now”.
Capturing insights that can perish requires a ‘Geo-Distributed Analytics Architecture’ such as the one depicted above. This architecture makes it possible to create insights that drive new ways to optimize, build efficient processes and develop new business models that enable the digital transformation.
Edge computing will become more important as the Internet of Things enables new innovative products and services. The ability to deliver ‘service assurance’ for these new products and services will become important, and a distributed edge analytics capability will be needed for businesses to compete and stay ahead in the ‘digital economy’.
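One way to picture distributed analysis is that each edge site reduces its raw data to a compact summary, and the core combines those summaries without ever receiving the raw readings. The Python sketch below illustrates that pattern under stated assumptions: the function names and the (count, total, peak) summary are hypothetical and are not taken from HSDP, and each edge is assumed to have at least one reading.

```python
def edge_summary(readings):
    """Reduce the raw readings at one edge site to a small summary."""
    return {"count": len(readings), "total": sum(readings), "peak": max(readings)}


def core_merge(summaries):
    """Combine edge summaries at the core without access to the raw readings."""
    count = sum(s["count"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    return {
        "count": count,
        "mean": total / count,
        "peak": max(s["peak"] for s in summaries),
    }


if __name__ == "__main__":
    edge_a = edge_summary([10, 20, 30])
    edge_b = edge_summary([40, 60])
    print(core_merge([edge_a, edge_b]))
```

Carrying the count and total, rather than a per-edge mean, is what allows the core to compute an exact overall mean from the summaries alone.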
Internet of Things - Introduces New Complexities to
Analytics
4. Use Cases
Use Case: Smart Power Grids
Vertical Solution: Smart Energy
Description:
- Monitor the condition of electricity consumption and generation by balancing “demand” and “supply”.
- Proactive problem handling by detecting signs of abnormality in sensor data.
- Prevention of large-scale blackouts by detecting signs of system failure.
- Improvement in measurement-based decision support.
Business Benefits: Using HSDP, energy customers of Hitachi are able to achieve better operating efficiencies, make better decisions and achieve better customer satisfaction.

Use Case: Smart Industry - HVAC and Efficient Energy Consumption
Vertical Solution: Smart Industry Utility Systems
Description:
- Reduction in the electricity cost of data centers by optimizing air conditioning systems.
- Continuous visualization of key metrics 24/7/365.
- Non-intrusive monitoring of the air conditioning systems using data captured from the “AirSense” sensor sold by Hitachi.
Business Benefits: With a solution built using HSDP, industrial utility customers of Hitachi were able to achieve better operating efficiencies and reduce costs.

Use Case: Disaster Response Management
Vertical Solution: Smart City
Description:
- Real-time availability of events and the ability to correlate events from devices in affected places aided better decision making.
Business Benefits: Using HSDP, local government agencies are able to respond to disasters in more effective ways.

Use Case: Efficient Production Line Systems
Vertical Solution: Smart Production
Description:
- Real-time manufacturing line monitoring makes it possible not only to check the operability of each process but also to easily trace defective products made on the line.
Business Benefits: Customers are able to isolate, predict and reduce downtime in production line systems.

Use Case: High Speed Index Service for Financial Exchanges
Vertical Solution: Smart Finance
Description:
- Investors can quickly get hold of and trade on detailed market movements.
- Investors can trade using “best quote indexes” as indicators with less tracking error.
Business Benefits: The Tokyo Stock Exchange was able to offer one of the world’s most advanced “High-Speed Index Service” offerings.

Use Case: Network Analytics and Optimization
Vertical Solution: Telecom Service Providers
Description:
- Alleviate congestion through real-time prescriptive analytics that detect congested cell sites.
- Enable optimization and a feedback loop by recommending actions for traffic shaping or compression.
Business Benefits: Enabled a large Tier 1 operator to realize a 15% Capex saving through a reduction in new cell site roll-outs.
HITACHI is a trademark or registered trademark of Hitachi, Ltd.
Regional Contact Information
Americas: +1 866 374 5822 or info@hds.com
Europe, Middle East and Africa: +44 (0) 1753 618000 or info.emea@hds.com
Asia Pacific: +852 3189 7900 or hds.marketing.apac@hds.com
Corporate Headquarters
2845 Lafayette Street
Santa Clara, CA 95050-2639 USA
www.HDS.com community.HDS.com
WHY HSDP?
Key Business Benefits of Hitachi Streaming Data Platform
Granular Analytics for Improved Visibility:
Fine-grained real-time analytics means performance visibility at the millisecond level. Service-level agreements (SLAs) for mission-critical traffic can now be assured based not on “averages” but on actual “visible” evidence.
Scalable Analytics for Cost Efficiency:
With Hitachi Streaming Data Platform, you can apply super-linear scaling and distributed processing to grow as your analytic needs increase from hundreds of megabits per second to tens of gigabits per second of streaming traffic, without being limited by physical boxes and costly upgrade curves.
Improve customer experience and retention:
With this real-time engine you have the insights to know what happened and what ‘is’ happening with your operations. From an operations perspective, you can now diagnose (MTTD) and troubleshoot problems much faster and reduce service restoration time (MTTR), thereby improving the overall customer experience and confidence in the network and operations.
No more science projects:
Get real-time analytics at scale delivered faster and see the benefits as you transform your organization into the digital age.
Integrate with existing Open Source technologies:
Continue to leverage your existing investments in open source technologies and easily integrate HSDP to enable real-time and distributed analytics use cases.
■■ Big and Fast Data
- Volume, Velocity, Variety
- Low Latency and High throughput
- Scalability and Availability
- Integration with Storm/Spark
- Integration with Kafka
- Integration with AWS/Azure stacks
■■ Open Platform
- Ingest from a variety of sources
- Descriptive, Predictive and Prescriptive
Analytics
- Publish to a variety of targets
- Support Popular Programming interfaces:
CQL, Java, C, R, Python
■■ Flexible Deployment
- Cloud or Core Data center
- Edge data center
- Remote Data sources
- Virtualization and Docker ready
Industry Leader: 50+ Stream
Data Processing Patents
6. System Requirements
■■ OPERATING SYSTEM
- RHEL 6.4, 6.5, 6.6
- SUSE 11 SP2, SP3
■■ VM
- KVM
- ESXi 5.1, 5.5, 6.0
■■ Java - 1.8
Performance
■■ HSDP can be configured to run on limited resources or in a clustered environment in a data center.
■■ Depending on the use case, HSDP provides two types of engines: the HSDP C-Engine and the HSDP Java Engine.
■■ In scenarios where scale-up is the only option and high performance is desired, the HSDP C-Engine is recommended.
- *Up to 1 Mtps/core throughput can be realized using the C-Engine.
■■ In scenarios where scale-out is an option, the HSDP Java Engine can be used.
- *Up to 20 to 30 Ktps/core can be realized using the Java Engine.
- The Java Engine also offers richer support for CQL.
■■ *Please note that performance numbers may vary based on the analysis scenario or use case.
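The per-core figures quoted in this datasheet can be turned into a rough first-pass sizing estimate. The Python sketch below is only back-of-the-envelope arithmetic: the function name and the 2.5 Mtps target are hypothetical, and because the quoted figures are "up to" values that vary by scenario, the results should be read as lower bounds on the number of cores.

```python
import math

# Per-core figures quoted in this datasheet ("up to" values, scenario dependent).
C_ENGINE_TPS_PER_CORE = 1_000_000  # up to 1 Mtps/core
JAVA_ENGINE_TPS_PER_CORE = 20_000  # lower end of the 20 to 30 Ktps/core range


def min_cores(target_tps, tps_per_core):
    """Smallest whole number of cores whose best-case throughput meets the target."""
    return math.ceil(target_tps / tps_per_core)


if __name__ == "__main__":
    target = 2_500_000  # hypothetical workload
    print("C-Engine cores:", min_cores(target, C_ENGINE_TPS_PER_CORE))
    print("Java Engine cores:", min_cores(target, JAVA_ENGINE_TPS_PER_CORE))
```

For this hypothetical target, the estimate is 3 cores for the C-Engine and 125 cores for the Java Engine, before allowing any headroom.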