SlideShare a Scribd company logo
1 of 4
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2241
An Overview of Data Lake
Pragati Kumai1, Smitha G R2
1,2 R.V College of Engineering, Bengaluru, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Data Lake is one of the contentious concepts that
emerged during the Big Data era. The ideaforDataLakecame
from the business world rather than the academic world. As
Data Lake is a newly conceived concept with revolutionary
concepts, its adoption presents numerous challenges. The
potential to change the data landscape, on the other hand,
makes Data Lake research worthwhile and the data lake is a
highly flexible repository that can store both structured and
unstructured data and employs a schema-on-read strategy. It
is an effective approach for today's problem on Big Data
Storage. However, it has a few defects, such as inadequate
security andiauthenticationimechanisms. Apache Hadoop is
usually recognized as a data lake industry standard. Its
parallel processing mechanisms enable rapid processing of
huge amounts of data. Many businesses have attempted to
create wrappers for Hadoop in order to address issues about
its raw state and poor data security. This includes platforms
such as Amazon Web Services (AWS) Data Lake and Azure
Data Lake. AWS Data Lakes offer simple solution with
safeguards to prevent data loss, whereas Azure Data Lakes
offer greater adaptability and organisation-level security.
Key Words: Big Data, Data Warehouse, OLAP, OLTP,
Data Lake, Apache Hadoop
1. INTRODUCTION
A data lake offers businesses a scalable and safe platform
that enables them to: ingest any data from any systematany
speed,iwhether it originates from on-premises, cloud, or
edge computing systems; store any type or volumeofdata in
full fidelity; process data in real time or batch mode; and
analyse data using SQL, Python, R, or any other language,
third-party data, or analytics applications. Businesses that
successfully get business value from their data will perform
better than their competitors. According to an
Aberdeenistudy, businesses who used data lakes
outperformed comparable businesses in terms of organic
revenue growth by 9%. These leaders were able to use fresh
data from sources includinglogfiles,click-streamdata,social
media, and internet-connected devices housed in the data
lake to perform new forms of analytics like machine
learning. This made it easier for them to recognise and take
advantage of business growth prospects by bringing in and
keeping clients, increasing productivity, maintaining
equipment proactively, and making wise judgments. A
typical organisation will need both a data warehouse and a
data lake, depending on the requirements,ias they eachfulfil
different needs and use cases.
A data warehouse is a database designed for the analysis of
relational data fromcorporateapplicationsandtransactional
systems. In order to optimise for quick SQL queries, the data
structure and schema are set in advance. The results are
often utilised for operational reportingandanalysis.Inorder
for data to serve as thei"single sourceoftruth"thatuserscan
rely on, information is cleansed, enriched, and transformed.
2. DATA LAKE CONCEPTS
The fundamental goal of a data lake is to ingest
unprocessed data and process it later. Therefore, data lakes
keep all the information and have a good flexibility.
However, data lakes, which include a large number of
datasets lacking precise models or descriptions,areprone to
quickly becoming invisible, impenetrable, and inaccessible.
establishing a metadata controlisystem for DL is therefore
required. In fact, many articles have underlined the value of
metadata [10]. Big data can also refer to technological
advancements in data processing and storage that enable
handling of exponential growth in data volumeinanytypeof
format [3]. Thei3-V model [7], which consists of three
dimensions of issues in data growth: volume, velocity, and
variety, is the foundation for another widely accepted
definition of big data. The increasing amount of data is
referred to as volume. Theiterms "velocity" and
"accessibility" refer to the rates at which new data are
generated and made available forfurtheranalysis.Therange
of various data types and sources is characterised by
variety.Ref [9] proposed more Data Lake specifications,
particularly from the perspective of the business domain
rather than the scientific community.
The data is loadedifrom the source systems
 No data is rejected.
 At the leaf level, data is stored in an untransformed
or in an untransformed state.
A DL can be accessed by multiple people and ingests and
stores a variety of data kinds. Numerous problems with
accessing, searching, and analysing data may arise if
properimanagement techniques are not in place [2].
Governance for data lakes is necessary in this situation. To
uncover some partial solutions in the cutting-edge work. [6]
suggests a "Just-enough Governance" for DLs with policies
for data quality, data on-boarding, metadata management,
andicompliance & audit. According to [4], data governance
must guarantee data availability and quality throughout the
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2242
whole data life-cycle.[5]provideddata governance with data
life-cycle management,idata lineage, data quality, and
security.
2.1 Data Lake (Architectural Implementation)
The architecture of a Data Lake can be divided into five
layers :
1. IngestioniLayer
2. DistillationiLayer
3. Processing Layer
4. InsightsiLayer
5. Unified OperationsiLayer
Ingestion of Raw Data into the Data Lake is the goal of the
Data Lake Architecture's Ingestion Layer. In this layer, there
is no data alteration.
The layer can take in Raw Data in batches or in real-time,
and itiorganises the data into a logical folder structure. The
Ingestion Layer can retrieve data from a variety of external
sources, including social networkingiwebsites, wearable
technology, Internet of Things (IoT) devices, and data
streaming equipment. This layer's advantage is that it can
swiftly assimilate any kind of data, such as:
 Streams of video from surveillance cameras.
 Health monitoring equipment data in real time.
 Many telemetry data types.
 Geographical information, pictures, and videos
taken with mobile devices.
Thisisecond phaseentailsstrengtheningthecapacityfordata
transformation andanalysis.Companiesemploythetool that
best suits their skill set at this point. More data are being
collected, and applications are being developed. The
enterprise data warehouse and data lake's capabilities are
combined here.
In thirdilayer, data can be kept in files or tables after being
processed into usable data sets. At this point, both the data's
purpose and its structure are known. Before this layer, you
should anticipate purging and transformations.
Denormalization and consolidation of various objects are
also frequent. This is the most complicated element of the
entire Data Lake solution because of everything mentioned
before. The framework is pretty plain and easy to
understand in terms of how to organise your data. For
instance: Files, Type, and Purpose. End users typically only
have access to this tier.
Theifourth layer consists of the Data Lake's query interface
and output interface. It makes requests for or retrieves data
from the Data Lake using SQL and NoSQL queries. Typically,
enterprise users who require access to the data run the
queries. The layer that displays the data to the user for
viewing after it has been fetchedfromtheData Lake.Reports
and dashboards are frequently the results of searches, and
they make it simple for users to draw conclusions from
theiunderlying data.
Speaking of the final phase of general data flow ina data lake
design is the consumption zone. Through the analytic
consumption tools and SQL and non-SQL queryicapabilities
at this layer, the findings and business insights fromanalytic
projects are made available to the targeted users, be they
technical decision-makers or business analysts.
2.2 Proposal for Improvement of Data Lake
Because a complete investigation has not been performed,
analysts in data lakes sometimes struggle to assess the
qualityiof the data. Additionally, since there is no account of
the history of conclusions made by earlier analysts, it is
impossible to incorporate insights from others who have
worked with the data. Finally,securityandaccesscontrol are
two of the main dangers associated with data lakes. Without
any control, data can be thrown into a lake, some of which
may have privacy and legal restrictions that other dataidoes
not.
The Swamp Problem : Aimassive data lake will accept any
data you offer it.
Unless all the data in your present systems is flawless and of
the greatest calibre possible: there are no formatting errors,
no missing values, no typos, and no data ever placed into the
wrong field. Additionally, theiinformation in all of your
present systems is flawlessly consistent with one another.
Then there won't be an issue.
Solution : There is more to a data lake than just having one.
The goal isito transform data into knowledge, knowledge
into action, and knowledge into action—all guided by the
individuals who are most familiar with your data.
Metadata management is how the tale of the data by
supplying context and explanations about the data's origin,
usefulness, quality,andsignificance.Evenwith relativelytiny
volumes of data, most organisations stillihave trouble
managing metadata.
Machine learning has the potential to change the game
because it can extract tacit knowledge from those who
understand data the best and transform it into algorithms
that can be utilised to scale-up automated data processing.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2243
The developmentiof data quality systems is the topic of
another type of frameworks. In order to understand best
practises in mature businesses and pinpoint areas for
growth, they strive to evaluatetheDQmanagementmaturity
level. Popular examples of these frameworks include
CapabilityiMaturity Model Integration (CMMI) and Total
DataiQuality Management (TDQM) (CMMI),etc.
An efficientidata quality model gives business data integrity
actions direction so that they are all built upon the same
comprehensiveunderstanding.Itlessenscomplexityiwithout
limiting the ability to iterate and evolveascomprehensionof
the data advances.
3. COMPARISON BETWEEN DATA LAKE & DATA
WAREHOUSE
Both data lakes and data warehouses are often used for
holding huge data, however the phrases are not
exchangeable. A data lake isa largepool ofunstructureddata
with no clear purpose. A data warehouse is a storage
location for organized, filtered data that has previouslybeen
processed for a particular purpose. The two methodsofdata
storage are frequently mistaken, yet they are far more
dissimilar than they are similar. The only true resemblance
between them is the high-level purpose of data storage. The
distinction is significant since they perform various
functions and need the employment of separate sets of eyes
to be adequately maximized. A data lake may be appropriate
for one firm, whereas a data warehouse may be more
appropriate for another.
The main difference between them are as follows:
 iData structure: iiraw vs. iiprocessed: Most of the data
in data lakes is unprocessed andraw.Unprocesseddata
for a purpose is known as raw data. Raw data is
convenient for machine learning and is simple to
examine. Whereas, data warehouses, hold processed
data. In contrast to raw data, processed data is easily
comprehended by a wide number of individuals. It also
saves money on costly storage space. The concern is
that a big volume of data may occasionally turn into
data swamps, with some data never being utilised.
 iPurpose: iUndetermined vs iiOperational:Compared
to data warehouses, data lakes contain less structure
and filtration of the data. It is referred to asprocessed
data whenever the raw data is used for a certain
purpose. This indicates that the data that is stored is
not squandered and will be put to use inside the
company.
 iUsers: Dataiiscientists vs business iiprofessionals:
People who are not used to working with raw data
frequently find it challenging to exploredata lakes.To
comprehend and transform raw, unstructured data
for any specific commercial application, it often takes
a data scientist and special equipment. To make
processed data readable by the majority, if not all, of
the personnel at a corporation, it is utilized in charts,
spreadsheets, tables, and other formats. Processed
data, such as that kept in data warehouses, simply
needs the user to be knowledgeable about thesubject
matter.
 iAccessibility: iiFlexible vs iisecure: The lack of
structure in data lake design makes it flexible and
simple to modify. Data lakes have extremely minimal
restrictions, thus any modifications to the data may
be made fast. Data warehouses are highly
organizedibyidesign, which makes it expensive and
harder to alter them. The fact that data is processed
and structured in a way that makes it easier to
understand is one of the main benefits of data
warehouse design.
4. CHALLENGES OF DATA LAKE
It may be quite expensive to set up and manage datalakes.
Although managedicloud data platforms are simpler to
implement, they are still challenging to administeriandhave
high costs. Data lakes may be challenging to maintain even
for experienced engineers. Whether a basiciopen-source
cloud data platform or a managed service is utilized, host
infrastructure has the bandwidth for the data lake to keep
expanding, dealingiwith duplicateidata is made sure,
safeguarding all of the data, and other suchichores are all
challenging jobs. Traditional cloud data platformsiare
effective at storing data but not so good at safeguarding it or
enforcing data governancepolicies.Thathavetoaddsecurity
and governance. That means extraitime, money, and
managementiheadaches. All members of the organization
have access to the data lake. Its practical use, however,isnot
equally available to everyone. Since the data lake also
contains unstructured data, non-technicaliusers may find it
challenging to understand the data.
5. DISCUSSION AND CONCLUSION
The data lake is a data storage depository that provides
companies with a strong potential due to its capacity to
handle massive amounts of all sorts of data. This ability
should be treated seriously in order to avoid a data swamp
of outdated, streaming, and forgotten information. Theideal
method to enhance the benefits of this repository type is to
choose a solution in which the stream goes both ways: the
group's data is streamed in to consolidate, backup, and
protect from a single central hub, while AI application
enables for an overflow of transparency and business
insights. Data Lake is a relatively new notion that is
expanding in parallel with the popularity of cloud, data
science, and artificial intelligence applications. It has
acquired popularity in the market because to its adaptable
architecture and the application or data type it supports,
which enables businesses to gather a comprehensive
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2244
pictureiof patterns in data. Data lake solutions are useful for
record keeping because they offer enterprises with a
centralised location to backup and secure their data. Each
data element in the lake is labelled with a set of metadata
tags, allowing users to derive its most information from
whatever is saved. Rather than having separate data
collections, all data to be stored is collected in DL to address
the old issue. The new problem is dealing with the
difficulties of the big data period, i.e., data lakes attempt to
overcome the difficulties demanded by big data V's - value,
volume, variety, verity, andvelocity.Data silosareverylikely
if data produced or developed by various departments
within the enterprise is only stored in their data stores.Data
Lake is a relatively new notion that is expanding in parallel
with the popularity of cloud, data science, and artificial
intelligence applications. It has acquired popularity in the
market because to its adaptable architecture and the
application or data type it supports, which enables
businesses to gather a comprehensive picture of patterns in
data.
ACKNOWLEDGEMENT
This work would not have been possible without the
contribution of Smitha G R, AssistantProfessor-RVCollegeof
Engineering, Aggarwal Akash, Data Analytics Lead-Koch
Business Solutions. I am very thankful for all the help they
provided throughout the course of this work.
REFERENCES
[1] Brian Stein, Alan Morrison,” The enterprise data lake:
Better integration and deeper analytics, Technology
Forecast: Rethinking integration”, Issue 1, 2014,
Retrieved 25, Aug. 2017.
[2] Timothy King “The Emergence of Data Lake: Pros and
Cons”, March 3, 2016, Retrieved Sep 15, 2017.
[3] Timothy King “The Emergence of Data Lake: Pros and
Cons”, March 3, 2016, Retrieved Sep 15, 2017.
[4] Huang Fang, Managing Data Lakes in Big Data Era:
What’s a data lake and why has it became popular in
data management ecosystem, The 5th Annual IEEE
International Conference on Cyber Technology in
Automation, Control and Intelligent Systems, June 8-12,
2015, Shenyang, China.
[5] Huang Fang, Managing Data Lakes in Big Data Era:
What’s a data lake and why has it became popular in
data management ecosystem, The 5th Annual IEEE
International Conference on Cyber Technology in
Automation, Control and Intelligent Systems, June 8-12,
2015, Shenyang, China.
[6] Huang Fang, Managing Data Lakes in Big Data Era:
What’s a data lake and why has it became popular in
data management ecosystem, The 5th Annual IEEE
International Conference on Cyber Technology in
Automation, Control and Intelligent Systems, June 8-12,
2015, Shenyang, China.
[7] Huang Fang, Managing Data Lakes in Big Data Era:
What’s a data lake and why has it became popular in
data management ecosystem, The 5th Annual IEEE
International Conference on Cyber Technology in
Automation, Control and Intelligent Systems, June 8-12,
2015, Shenyang, China.

More Related Content

Similar to An Overview of Data Lake

CASE STUDY ON METHODS AND TOOLS FOR THE BIG DATA ANALYSIS
CASE STUDY ON METHODS AND TOOLS FOR THE BIG DATA ANALYSISCASE STUDY ON METHODS AND TOOLS FOR THE BIG DATA ANALYSIS
CASE STUDY ON METHODS AND TOOLS FOR THE BIG DATA ANALYSISIRJET Journal
 
Big Data Processing with Hadoop : A Review
Big Data Processing with Hadoop : A ReviewBig Data Processing with Hadoop : A Review
Big Data Processing with Hadoop : A ReviewIRJET Journal
 
The Big Data Importance – Tools and their Usage
The Big Data Importance – Tools and their UsageThe Big Data Importance – Tools and their Usage
The Big Data Importance – Tools and their UsageIRJET Journal
 
Data Virtualization: An Essential Component of a Cloud Data Lake
Data Virtualization: An Essential Component of a Cloud Data LakeData Virtualization: An Essential Component of a Cloud Data Lake
Data Virtualization: An Essential Component of a Cloud Data LakeDenodo
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...Denodo
 
A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)Denodo
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeDATAVERSITY
 
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data LakesData Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data LakesDenodo
 
IRJET- Data Analytics & Visualization using Qlik
IRJET- Data Analytics & Visualization using QlikIRJET- Data Analytics & Visualization using Qlik
IRJET- Data Analytics & Visualization using QlikIRJET Journal
 
Introduction to Modern Data Virtualization 2021 (APAC)
Introduction to Modern Data Virtualization 2021 (APAC)Introduction to Modern Data Virtualization 2021 (APAC)
Introduction to Modern Data Virtualization 2021 (APAC)Denodo
 
Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...
Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...
Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...IRJET Journal
 
An Overview of General Data Mining Tools
An Overview of General Data Mining ToolsAn Overview of General Data Mining Tools
An Overview of General Data Mining ToolsIRJET Journal
 
Paper Final Taube Bienert GridInterop 2012
Paper Final Taube Bienert GridInterop 2012Paper Final Taube Bienert GridInterop 2012
Paper Final Taube Bienert GridInterop 2012Bert Taube
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Denodo
 
Analysis on Deduplication Techniques for Storage of Data in Cloud
Analysis on Deduplication Techniques for Storage of Data in CloudAnalysis on Deduplication Techniques for Storage of Data in Cloud
Analysis on Deduplication Techniques for Storage of Data in CloudIRJET Journal
 
Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationDenodo
 
Data Lakes: A Logical Approach for Faster Unified Insights
Data Lakes: A Logical Approach for Faster Unified InsightsData Lakes: A Logical Approach for Faster Unified Insights
Data Lakes: A Logical Approach for Faster Unified InsightsDenodo
 
Data Democratization for Faster Decision-making and Business Agility (ASEAN)
Data Democratization for Faster Decision-making and Business Agility (ASEAN)Data Democratization for Faster Decision-making and Business Agility (ASEAN)
Data Democratization for Faster Decision-making and Business Agility (ASEAN)Denodo
 
Cloud Analytics Ability to Design, Build, Secure, and Maintain Analytics Solu...
Cloud Analytics Ability to Design, Build, Secure, and Maintain Analytics Solu...Cloud Analytics Ability to Design, Build, Secure, and Maintain Analytics Solu...
Cloud Analytics Ability to Design, Build, Secure, and Maintain Analytics Solu...YogeshIJTSRD
 
Fbdl enabling comprehensive_data_services
Fbdl enabling comprehensive_data_servicesFbdl enabling comprehensive_data_services
Fbdl enabling comprehensive_data_servicesCindy Irby
 

Similar to An Overview of Data Lake (20)

CASE STUDY ON METHODS AND TOOLS FOR THE BIG DATA ANALYSIS
CASE STUDY ON METHODS AND TOOLS FOR THE BIG DATA ANALYSISCASE STUDY ON METHODS AND TOOLS FOR THE BIG DATA ANALYSIS
CASE STUDY ON METHODS AND TOOLS FOR THE BIG DATA ANALYSIS
 
Big Data Processing with Hadoop : A Review
Big Data Processing with Hadoop : A ReviewBig Data Processing with Hadoop : A Review
Big Data Processing with Hadoop : A Review
 
The Big Data Importance – Tools and their Usage
The Big Data Importance – Tools and their UsageThe Big Data Importance – Tools and their Usage
The Big Data Importance – Tools and their Usage
 
Data Virtualization: An Essential Component of a Cloud Data Lake
Data Virtualization: An Essential Component of a Cloud Data LakeData Virtualization: An Essential Component of a Cloud Data Lake
Data Virtualization: An Essential Component of a Cloud Data Lake
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
 
A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-Purpose
 
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data LakesData Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
 
IRJET- Data Analytics & Visualization using Qlik
IRJET- Data Analytics & Visualization using QlikIRJET- Data Analytics & Visualization using Qlik
IRJET- Data Analytics & Visualization using Qlik
 
Introduction to Modern Data Virtualization 2021 (APAC)
Introduction to Modern Data Virtualization 2021 (APAC)Introduction to Modern Data Virtualization 2021 (APAC)
Introduction to Modern Data Virtualization 2021 (APAC)
 
Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...
Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...
Quality of Groundwater in Lingala Mandal of YSR Kadapa District, Andhraprades...
 
An Overview of General Data Mining Tools
An Overview of General Data Mining ToolsAn Overview of General Data Mining Tools
An Overview of General Data Mining Tools
 
Paper Final Taube Bienert GridInterop 2012
Paper Final Taube Bienert GridInterop 2012Paper Final Taube Bienert GridInterop 2012
Paper Final Taube Bienert GridInterop 2012
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)
 
Analysis on Deduplication Techniques for Storage of Data in Cloud
Analysis on Deduplication Techniques for Storage of Data in CloudAnalysis on Deduplication Techniques for Storage of Data in Cloud
Analysis on Deduplication Techniques for Storage of Data in Cloud
 
Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data Virtualization
 
Data Lakes: A Logical Approach for Faster Unified Insights
Data Lakes: A Logical Approach for Faster Unified InsightsData Lakes: A Logical Approach for Faster Unified Insights
Data Lakes: A Logical Approach for Faster Unified Insights
 
Data Democratization for Faster Decision-making and Business Agility (ASEAN)
Data Democratization for Faster Decision-making and Business Agility (ASEAN)Data Democratization for Faster Decision-making and Business Agility (ASEAN)
Data Democratization for Faster Decision-making and Business Agility (ASEAN)
 
Cloud Analytics Ability to Design, Build, Secure, and Maintain Analytics Solu...
Cloud Analytics Ability to Design, Build, Secure, and Maintain Analytics Solu...Cloud Analytics Ability to Design, Build, Secure, and Maintain Analytics Solu...
Cloud Analytics Ability to Design, Build, Secure, and Maintain Analytics Solu...
 
Fbdl enabling comprehensive_data_services
Fbdl enabling comprehensive_data_servicesFbdl enabling comprehensive_data_services
Fbdl enabling comprehensive_data_services
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2RajaP95
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 

Recently uploaded (20)

VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 

An Overview of Data Lake

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2241 An Overview of Data Lake Pragati Kumai1, Smitha G R2 1,2 R.V College of Engineering, Bengaluru, India ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - Data Lake is one of the contentious concepts that emerged during the Big Data era. The ideaforDataLakecame from the business world rather than the academic world. As Data Lake is a newly conceived concept with revolutionary concepts, its adoption presents numerous challenges. The potential to change the data landscape, on the other hand, makes Data Lake research worthwhile and the data lake is a highly flexible repository that can store both structured and unstructured data and employs a schema-on-read strategy. It is an effective approach for today's problem on Big Data Storage. However, it has a few defects, such as inadequate security andiauthenticationimechanisms. Apache Hadoop is usually recognized as a data lake industry standard. Its parallel processing mechanisms enable rapid processing of huge amounts of data. Many businesses have attempted to create wrappers for Hadoop in order to address issues about its raw state and poor data security. This includes platforms such as Amazon Web Services (AWS) Data Lake and Azure Data Lake. AWS Data Lakes offer simple solution with safeguards to prevent data loss, whereas Azure Data Lakes offer greater adaptability and organisation-level security. Key Words: Big Data, Data Warehouse, OLAP, OLTP, Data Lake, Apache Hadoop 1. INTRODUCTION A data lake offers businesses a scalable and safe platform that enables them to: ingest any data from any systematany speed,iwhether it originates from on-premises, cloud, or edge computing systems; store any type or volumeofdata in full fidelity; process data in real time or batch mode; and analyse data using SQL, Python, R, or any other language, third-party data, or analytics applications. Businesses that successfully get business value from their data will perform better than their competitors. According to an Aberdeenistudy, businesses who used data lakes outperformed comparable businesses in terms of organic revenue growth by 9%. These leaders were able to use fresh data from sources includinglogfiles,click-streamdata,social media, and internet-connected devices housed in the data lake to perform new forms of analytics like machine learning. This made it easier for them to recognise and take advantage of business growth prospects by bringing in and keeping clients, increasing productivity, maintaining equipment proactively, and making wise judgments. A typical organisation will need both a data warehouse and a data lake, depending on the requirements,ias they eachfulfil different needs and use cases. A data warehouse is a database designed for the analysis of relational data fromcorporateapplicationsandtransactional systems. In order to optimise for quick SQL queries, the data structure and schema are set in advance. The results are often utilised for operational reportingandanalysis.Inorder for data to serve as thei"single sourceoftruth"thatuserscan rely on, information is cleansed, enriched, and transformed. 2. DATA LAKE CONCEPTS The fundamental goal of a data lake is to ingest unprocessed data and process it later. Therefore, data lakes keep all the information and have a good flexibility. However, data lakes, which include a large number of datasets lacking precise models or descriptions,areprone to quickly becoming invisible, impenetrable, and inaccessible. establishing a metadata controlisystem for DL is therefore required. In fact, many articles have underlined the value of metadata [10]. Big data can also refer to technological advancements in data processing and storage that enable handling of exponential growth in data volumeinanytypeof format [3]. Thei3-V model [7], which consists of three dimensions of issues in data growth: volume, velocity, and variety, is the foundation for another widely accepted definition of big data. The increasing amount of data is referred to as volume. Theiterms "velocity" and "accessibility" refer to the rates at which new data are generated and made available forfurtheranalysis.Therange of various data types and sources is characterised by variety.Ref [9] proposed more Data Lake specifications, particularly from the perspective of the business domain rather than the scientific community. The data is loadedifrom the source systems  No data is rejected.  At the leaf level, data is stored in an untransformed or in an untransformed state. A DL can be accessed by multiple people and ingests and stores a variety of data kinds. Numerous problems with accessing, searching, and analysing data may arise if properimanagement techniques are not in place [2]. Governance for data lakes is necessary in this situation. To uncover some partial solutions in the cutting-edge work. [6] suggests a "Just-enough Governance" for DLs with policies for data quality, data on-boarding, metadata management, andicompliance & audit. According to [4], data governance must guarantee data availability and quality throughout the
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2242 whole data life-cycle.[5]provideddata governance with data life-cycle management,idata lineage, data quality, and security. 2.1 Data Lake (Architectural Implementation) The architecture of a Data Lake can be divided into five layers : 1. IngestioniLayer 2. DistillationiLayer 3. Processing Layer 4. InsightsiLayer 5. Unified OperationsiLayer Ingestion of Raw Data into the Data Lake is the goal of the Data Lake Architecture's Ingestion Layer. In this layer, there is no data alteration. The layer can take in Raw Data in batches or in real-time, and itiorganises the data into a logical folder structure. The Ingestion Layer can retrieve data from a variety of external sources, including social networkingiwebsites, wearable technology, Internet of Things (IoT) devices, and data streaming equipment. This layer's advantage is that it can swiftly assimilate any kind of data, such as:  Streams of video from surveillance cameras.  Health monitoring equipment data in real time.  Many telemetry data types.  Geographical information, pictures, and videos taken with mobile devices. Thisisecond phaseentailsstrengtheningthecapacityfordata transformation andanalysis.Companiesemploythetool that best suits their skill set at this point. More data are being collected, and applications are being developed. The enterprise data warehouse and data lake's capabilities are combined here. In thirdilayer, data can be kept in files or tables after being processed into usable data sets. At this point, both the data's purpose and its structure are known. Before this layer, you should anticipate purging and transformations. Denormalization and consolidation of various objects are also frequent. This is the most complicated element of the entire Data Lake solution because of everything mentioned before. The framework is pretty plain and easy to understand in terms of how to organise your data. For instance: Files, Type, and Purpose. End users typically only have access to this tier. Theifourth layer consists of the Data Lake's query interface and output interface. It makes requests for or retrieves data from the Data Lake using SQL and NoSQL queries. Typically, enterprise users who require access to the data run the queries. The layer that displays the data to the user for viewing after it has been fetchedfromtheData Lake.Reports and dashboards are frequently the results of searches, and they make it simple for users to draw conclusions from theiunderlying data. Speaking of the final phase of general data flow ina data lake design is the consumption zone. Through the analytic consumption tools and SQL and non-SQL queryicapabilities at this layer, the findings and business insights fromanalytic projects are made available to the targeted users, be they technical decision-makers or business analysts. 2.2 Proposal for Improvement of Data Lake Because a complete investigation has not been performed, analysts in data lakes sometimes struggle to assess the qualityiof the data. Additionally, since there is no account of the history of conclusions made by earlier analysts, it is impossible to incorporate insights from others who have worked with the data. Finally,securityandaccesscontrol are two of the main dangers associated with data lakes. Without any control, data can be thrown into a lake, some of which may have privacy and legal restrictions that other dataidoes not. The Swamp Problem : Aimassive data lake will accept any data you offer it. Unless all the data in your present systems is flawless and of the greatest calibre possible: there are no formatting errors, no missing values, no typos, and no data ever placed into the wrong field. Additionally, theiinformation in all of your present systems is flawlessly consistent with one another. Then there won't be an issue. Solution : There is more to a data lake than just having one. The goal isito transform data into knowledge, knowledge into action, and knowledge into action—all guided by the individuals who are most familiar with your data. Metadata management is how the tale of the data by supplying context and explanations about the data's origin, usefulness, quality,andsignificance.Evenwith relativelytiny volumes of data, most organisations stillihave trouble managing metadata. Machine learning has the potential to change the game because it can extract tacit knowledge from those who understand data the best and transform it into algorithms that can be utilised to scale-up automated data processing.
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2243 The developmentiof data quality systems is the topic of another type of frameworks. In order to understand best practises in mature businesses and pinpoint areas for growth, they strive to evaluatetheDQmanagementmaturity level. Popular examples of these frameworks include CapabilityiMaturity Model Integration (CMMI) and Total DataiQuality Management (TDQM) (CMMI),etc. An efficientidata quality model gives business data integrity actions direction so that they are all built upon the same comprehensiveunderstanding.Itlessenscomplexityiwithout limiting the ability to iterate and evolveascomprehensionof the data advances. 3. COMPARISON BETWEEN DATA LAKE & DATA WAREHOUSE Both data lakes and data warehouses are often used for holding huge data, however the phrases are not exchangeable. A data lake isa largepool ofunstructureddata with no clear purpose. A data warehouse is a storage location for organized, filtered data that has previouslybeen processed for a particular purpose. The two methodsofdata storage are frequently mistaken, yet they are far more dissimilar than they are similar. The only true resemblance between them is the high-level purpose of data storage. The distinction is significant since they perform various functions and need the employment of separate sets of eyes to be adequately maximized. A data lake may be appropriate for one firm, whereas a data warehouse may be more appropriate for another. The main difference between them are as follows:  iData structure: iiraw vs. iiprocessed: Most of the data in data lakes is unprocessed andraw.Unprocesseddata for a purpose is known as raw data. Raw data is convenient for machine learning and is simple to examine. Whereas, data warehouses, hold processed data. In contrast to raw data, processed data is easily comprehended by a wide number of individuals. It also saves money on costly storage space. The concern is that a big volume of data may occasionally turn into data swamps, with some data never being utilised.  iPurpose: iUndetermined vs iiOperational:Compared to data warehouses, data lakes contain less structure and filtration of the data. It is referred to asprocessed data whenever the raw data is used for a certain purpose. This indicates that the data that is stored is not squandered and will be put to use inside the company.  iUsers: Dataiiscientists vs business iiprofessionals: People who are not used to working with raw data frequently find it challenging to exploredata lakes.To comprehend and transform raw, unstructured data for any specific commercial application, it often takes a data scientist and special equipment. To make processed data readable by the majority, if not all, of the personnel at a corporation, it is utilized in charts, spreadsheets, tables, and other formats. Processed data, such as that kept in data warehouses, simply needs the user to be knowledgeable about thesubject matter.  iAccessibility: iiFlexible vs iisecure: The lack of structure in data lake design makes it flexible and simple to modify. Data lakes have extremely minimal restrictions, thus any modifications to the data may be made fast. Data warehouses are highly organizedibyidesign, which makes it expensive and harder to alter them. The fact that data is processed and structured in a way that makes it easier to understand is one of the main benefits of data warehouse design. 4. CHALLENGES OF DATA LAKE It may be quite expensive to set up and manage datalakes. Although managedicloud data platforms are simpler to implement, they are still challenging to administeriandhave high costs. Data lakes may be challenging to maintain even for experienced engineers. Whether a basiciopen-source cloud data platform or a managed service is utilized, host infrastructure has the bandwidth for the data lake to keep expanding, dealingiwith duplicateidata is made sure, safeguarding all of the data, and other suchichores are all challenging jobs. Traditional cloud data platformsiare effective at storing data but not so good at safeguarding it or enforcing data governancepolicies.Thathavetoaddsecurity and governance. That means extraitime, money, and managementiheadaches. All members of the organization have access to the data lake. Its practical use, however,isnot equally available to everyone. Since the data lake also contains unstructured data, non-technicaliusers may find it challenging to understand the data. 5. DISCUSSION AND CONCLUSION The data lake is a data storage depository that provides companies with a strong potential due to its capacity to handle massive amounts of all sorts of data. This ability should be treated seriously in order to avoid a data swamp of outdated, streaming, and forgotten information. Theideal method to enhance the benefits of this repository type is to choose a solution in which the stream goes both ways: the group's data is streamed in to consolidate, backup, and protect from a single central hub, while AI application enables for an overflow of transparency and business insights. Data Lake is a relatively new notion that is expanding in parallel with the popularity of cloud, data science, and artificial intelligence applications. It has acquired popularity in the market because to its adaptable architecture and the application or data type it supports, which enables businesses to gather a comprehensive
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2244 pictureiof patterns in data. Data lake solutions are useful for record keeping because they offer enterprises with a centralised location to backup and secure their data. Each data element in the lake is labelled with a set of metadata tags, allowing users to derive its most information from whatever is saved. Rather than having separate data collections, all data to be stored is collected in DL to address the old issue. The new problem is dealing with the difficulties of the big data period, i.e., data lakes attempt to overcome the difficulties demanded by big data V's - value, volume, variety, verity, andvelocity.Data silosareverylikely if data produced or developed by various departments within the enterprise is only stored in their data stores.Data Lake is a relatively new notion that is expanding in parallel with the popularity of cloud, data science, and artificial intelligence applications. It has acquired popularity in the market because to its adaptable architecture and the application or data type it supports, which enables businesses to gather a comprehensive picture of patterns in data. ACKNOWLEDGEMENT This work would not have been possible without the contribution of Smitha G R, AssistantProfessor-RVCollegeof Engineering, Aggarwal Akash, Data Analytics Lead-Koch Business Solutions. I am very thankful for all the help they provided throughout the course of this work. REFERENCES [1] Brian Stein, Alan Morrison,” The enterprise data lake: Better integration and deeper analytics, Technology Forecast: Rethinking integration”, Issue 1, 2014, Retrieved 25, Aug. 2017. [2] Timothy King “The Emergence of Data Lake: Pros and Cons”, March 3, 2016, Retrieved Sep 15, 2017. [3] Timothy King “The Emergence of Data Lake: Pros and Cons”, March 3, 2016, Retrieved Sep 15, 2017. [4] Huang Fang, Managing Data Lakes in Big Data Era: What’s a data lake and why has it became popular in data management ecosystem, The 5th Annual IEEE International Conference on Cyber Technology in Automation, Control and Intelligent Systems, June 8-12, 2015, Shenyang, China. [5] Huang Fang, Managing Data Lakes in Big Data Era: What’s a data lake and why has it became popular in data management ecosystem, The 5th Annual IEEE International Conference on Cyber Technology in Automation, Control and Intelligent Systems, June 8-12, 2015, Shenyang, China. [6] Huang Fang, Managing Data Lakes in Big Data Era: What’s a data lake and why has it became popular in data management ecosystem, The 5th Annual IEEE International Conference on Cyber Technology in Automation, Control and Intelligent Systems, June 8-12, 2015, Shenyang, China. [7] Huang Fang, Managing Data Lakes in Big Data Era: What’s a data lake and why has it became popular in data management ecosystem, The 5th Annual IEEE International Conference on Cyber Technology in Automation, Control and Intelligent Systems, June 8-12, 2015, Shenyang, China.