CIOReview | October 2015
The Navigator for Enterprise Solutions
October 30, 2015 | CIOREVIEW.COM
BIG DATA SPECIAL
Andre Fuetsch,
SVP,
AT&T
In My Opinion:
Randy Sloan,
SVP & CIO,
Southwest Airlines
CIO Insights:
Smarter
Decisions through
Big Data
TransUnion:
Jim Peck,
CEO & President
Company of the Month:
Ty Moser, Founder,
President & CEO,
Moser Consulting
CXO
INSIGHTS
The preservation of human
knowledge is of paramount
importance to progress,
now and in the future. And
because the vast majority of new data
is stored digitally, the need for reliable
digital storage is greater than ever. The
challenge today is ensuring that the
drives mass-produced by the storage
industry in order to keep up with the
ever-growing need for data storage are
manufactured according to the highest
standards of quality. The solution to that
challenge may lie in a relatively new but
fast-growing field known as Big Data
Analytics.
The need for reliable data storage
is particularly urgent in light of the fact
that the amount of data stored every year
is increasing rapidly. Indeed, much more
data is generated than is actually stored.
For example, CERN generates close to
a petabyte of data every second while
particles fired around the Large Hadron
Collider at velocities approaching the
speed of light are smashed together. But
CERN can only store approximately 25
PB of this data every year—equivalent
to about 8,333 full 3 TB hard disk drives.
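The drive count quoted above follows from a single division. The sketch below reproduces it, assuming decimal units (1 PB = 1,000 TB) and the 3 TB drive size given in the text; the variable names are illustrative only.

```python
# Decimal (SI) units are assumed: 1 PB = 1,000 TB.
TB_PER_PB = 1_000

stored_per_year_pb = 25   # approximate yearly storage quoted in the text
drive_capacity_tb = 3     # drive size quoted in the text

drives_needed = stored_per_year_pb * TB_PER_PB / drive_capacity_tb
print(f"{drives_needed:,.0f} full drives per year")  # prints "8,333 full drives per year"
```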
When a disk drive is manufactured, it
acts as an intelligent sensor that is aware
of its own health and quality, and it
stores its own sensor logs. These drives
are tested for many days, and during that
time, they might generate megabytes of
test, diagnostic, and configuration data,
with as many as 1,000 variables logged
for each drive. In addition, information
is collected about every important
component going into each drive, how
these components are combined, where
and when each component and each
drive was built, which firmware is used,
which customer it goes to, and many
other pieces of information.
Together, these parameters,
attributes, and measurements
can produce hundreds
of thousands of combinations and
interdependencies. Analyzing
these combinations alone and together
requires new ways, new tools and new
ideas in order to separate key signals
or information from noise. There are
so many variables and parameters that
affect drive quality, reliability, and
performance that no traditional data
analysis approach can easily work on the
data generated and collected during the
manufacturing process.
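To give a sense of scale, suppose that only interactions between pairs of the roughly 1,000 logged variables are counted. This is an assumption made for illustration, since the article does not say how its combinations are counted. Even then the total is already in the hundreds of thousands, and it grows quickly once triples are included.

```python
from math import comb

n_variables = 1_000  # upper figure quoted in the text

pairs = comb(n_variables, 2)    # unordered pairs of variables
triples = comb(n_variables, 3)  # unordered triples of variables

print(f"pairs:   {pairs:,}")    # pairs:   499,500
print(f"triples: {triples:,}")  # triples: 166,167,000
```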
Using Big Data Analytics
to produce high-quality
Big Data Storage
By Andrei Khurshudov, Chief Technologist, Seagate; Mark Brewer, SVP and CIO, Seagate Technology; and Michael Crump,
VP of Quality, Seagate Technology
How do we address this drive quality and reliability
challenge? Through Big Data Analytics, which combines
techniques such as advanced statistics and machine learning with
large amounts of data to extract answers that are not
visible to more traditional analytics operating on smaller
data sets. With so much data available, Big Data Analytics
can help control product quality and troubleshoot issues as
quickly as possible.
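As a minimal sketch of the statistical end of this spectrum, the snippet below flags drives whose test measurement lies far from the population mean. The serial numbers, the measurement, and the threshold are all hypothetical. A z-score screen is only a simple stand-in for the advanced statistics and machine learning the article refers to, not the authors' method.

```python
from statistics import mean, pstdev

# Hypothetical per-drive test measurement (for example, an error count).
measurements = {
    "SN001": 12, "SN002": 11, "SN003": 13, "SN004": 12,
    "SN005": 10, "SN006": 14, "SN007": 12, "SN008": 60,
}

def flag_outliers(values, threshold=2.0):
    """Return the keys whose value lies more than `threshold` population
    standard deviations away from the mean."""
    mu = mean(values.values())
    sigma = pstdev(values.values())
    if sigma == 0:
        return []  # all values identical, nothing to flag
    return [k for k, v in values.items() if abs(v - mu) / sigma > threshold]

suspects = flag_outliers(measurements)
print(suspects)  # ['SN008']
```

With a thousand variables per drive, a screen like this would be one small building block among many rather than a complete quality-control method.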
The first thing we need in order to implement Big Data
Analytics that ensures magnetic hard drive reliability is a
robust, coherent, end-to-end data collection process that
captures everything that could be important and offers it for
further analysis. The data must be available when it is needed
and found where it is expected, and the process is coherent in the sense
that all those pieces of data can be matched together as needed.
A hard drive is subject to this process from the time
and place where each main component is “born”, to the drive
factory, through the assembly lines and days of configuration and
testing, to the customer who uses the drives to build computers
or storage systems, to the end user, and all the way to the end of
the drive’s life.
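The coherence requirement amounts to being able to join records from different stages of a drive's life on a shared key. The following is a minimal sketch, assuming a hypothetical drive serial number as that key and made-up record fields; a real pipeline would likely use a database or a distributed store rather than in-memory dictionaries.

```python
# Hypothetical records from three stages, all keyed by drive serial number.
components = {"SN001": {"head_lot": "H-17", "media_lot": "M-04"},
              "SN002": {"head_lot": "H-18", "media_lot": "M-04"}}
factory_tests = {"SN001": {"test_days": 5, "passed": True},
                 "SN002": {"test_days": 6, "passed": True}}
field_reports = {"SN002": {"customer": "OEM-A", "returned": True}}

def drive_history(serial):
    """Merge every record known for one drive into a single view.
    Stages with no record for this drive contribute nothing."""
    history = {"serial": serial}
    for stage in (components, factory_tests, field_reports):
        history.update(stage.get(serial, {}))
    return history

print(drive_history("SN002"))
```

Because every stage shares the same key, a field return for `SN002` can be traced back to its component lots and factory test results in one lookup.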
Second, we need storage infrastructure and an ecosystem
that lends itself to Big Data Analytics and complex data
mining. That means a more traditional Enterprise Data
Warehouse architecture running relational databases should
be complemented by (and linked to) solutions designed for
distributed analytics and parallel computing. The result is a
modern ecosystem with Hadoop and Spark capabilities, NoSQL
databases (such as MongoDB and Cassandra), and the ability
to store all available data (both structured and unstructured)
and access it in parallel for better performance.
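The partitioned access pattern behind such systems can be sketched on a single machine. The example below uses Python's standard `concurrent.futures` module as a stand-in; it does not use Hadoop or Spark, and the partitions and record layout are hypothetical. It illustrates only the idea of processing partitions independently and then combining the partial results.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions of drive test records: (serial, passed) pairs.
partitions = [
    [("SN001", True), ("SN002", False)],
    [("SN003", True), ("SN004", True)],
    [("SN005", False), ("SN006", True)],
]

def count_failures(partition):
    """Map step: count failed drives within one partition."""
    return sum(1 for _, passed in partition if not passed)

# Each partition is handed to a worker independently; the partial
# counts are then combined (the reduce step).
with ThreadPoolExecutor(max_workers=3) as pool:
    partial_counts = list(pool.map(count_failures, partitions))

total_failures = sum(partial_counts)
print(partial_counts, total_failures)  # [1, 0, 1] 2
```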
Third, we need trained personnel using Big Data Analytics
algorithms and solutions: true Data Scientists capable of
working with extremely large data sets using the most advanced
machine-learning techniques, and of seamlessly linking together the
best programming environments and languages, machine-learning
libraries, and elements of a highly distributed storage
and analytics ecosystem. Together, they can understand
the complex data generated through testing and guarantee the
best product quality, reliability, and performance possible.
This is the approach that Seagate has implemented, and it
has already resulted in a dramatic improvement in the quality
of our products—which means more data can be preserved to
retrieve and use in the future.
Modern challenges require modern approaches. Making
highly reliable devices to store all of the data generated
in today’s world, and mass-producing these devices by the tens of
millions per quarter, becomes impossible without Big Data
Analytics and Machine Learning technology. These are now a
requirement for any leading high-volume technology company
in the 21st century.
Seagate's reputation for
quality and reliability
in its products is driven
by our manufacturing
excellence and supply
chain efficiency
Michael Crump
