IT6701 – Information Management
Unit V – Information Lifecycle Management
By
Kaviya.P, AP/IT
Kamaraj College of Engineering & Technology
Unit V – Information Lifecycle Management
Data retention policies; confidential and sensitive data handling; lifecycle
management costs; archive data using Hadoop; testing and delivering big data
applications for performance and functionality; challenges with data
administration
Data Retention Policies
What is a Data Retention Policy?
• A document retention policy provides for the systematic review, retention and
destruction of documents received or created in the course of business.
• A document retention policy will identify documents that need to be maintained
and contain guidelines for how long certain documents should be kept and how
they should be destroyed.
Purpose of Data Retention Policies
• To maintain important records and documents for future use or reference.
• To dispose of records or documents that are no longer needed.
• To organize records so that they can be searched and accessed easily at a later
date.
Categories of Requirements
• Legal or Legitimate requirements: The compliance or legal aspect, for example when
a legal case is filed and a piece of information needs to be produced in a
court of law.
• Business or Commercial requirements: To make information available from the
operation’s perspective.
• Personal or Private requirements: To make information available from the
personal perspective.
Scope : Categories of Document (What documents must be protected?)
• Legal Records: Include all legal records, contracts, trademarks, powers of attorney,
press releases, etc. These are the first set of documents that should be considered for
retention.
• Final Records: Documents not requiring ad hoc modification or alteration. They can
also specify records of completed activities.
• Permanent Records: Include all the business documents that describe the
organization’s details. They can also comprise contracts, financial registers,
copyrights, patents, and proposals.
• Accounting and Corporate Tax Records: Consist of financial statements,
investments, audits, tax returns, purchase and sales records, etc.
• Workplace Records: Information about the day-to-day activities of employees,
agreements, minutes of meetings, bylaws, etc.
• Employment, Employee, and Payroll Records: Include job postings, job
advertisements, recruitment procedures, performance reviews, etc.
• Bank Records: Information about bank transactions, deposits, cheque details, stop
payments, and bounced cheques.
• Historic Records: Records that are no longer required by the organization.
• Temporary Records: Documents that are not completed or finalized.
Data Retention Policy
• When developing a retention policy, it is important to focus on the reason behind data
retention.
• The retention decision is based on creation date and can include other criteria such as last access time, type of
data, the period for which the data is valid, data value, etc.
• The policy document should include details of the data/document that needs to be retained.
• The data should be divided into various categories such as personal employee data, client
data, financial data, legal data, etc.
• This division would help in deciding the duration of retention and destruction procedures.
• When the data retention period is over, the data should be discarded.
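The retention decision described above can be sketched in a few lines of Python. This is only an illustration: the category names, the retention periods in RETENTION_DAYS, and the Record type are assumptions chosen for the example, not values taken from any policy or law.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods per data category, in days.
# Real values come from the organization's policy and applicable law.
RETENTION_DAYS = {
    "financial": 10 * 365,
    "employee": 7 * 365,
    "client": 5 * 365,
    "temporary": 90,
}


@dataclass
class Record:
    name: str
    category: str
    created: date


def is_due_for_destruction(record: Record, today: date) -> bool:
    """Return True once the record's retention period has elapsed."""
    period = timedelta(days=RETENTION_DAYS[record.category])
    return today >= record.created + period


records = [
    Record("draft-notes", "temporary", date(2023, 1, 1)),
    Record("tax-return-2020", "financial", date(2021, 3, 31)),
]
today = date(2024, 1, 1)

# Names of records whose retention period is over and that should be discarded.
expired = [r.name for r in records if is_due_for_destruction(r, today)]
print(expired)
```

Only creation date and category are used here; criteria such as last access time or data value would be added as further fields on the record.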
Why Have Data Retention Policies?
The policy is also helpful to:
• Provide a system for complying with document retention laws
• Ensure that valuable documents are available when needed
• Save money, space and time
• Protect against allegations of selective document destruction, and
• Provide for the routine destruction of non-business, superfluous and outdated
documents
The six most important reasons why an organization should implement a document
retention policy are:
1. To comply with legal duties and requirements, either statutory or regulatory
2. To avoid liability through “spoliation”, the improper destruction or alteration of
documents in a litigation situation
3. To support or oppose a position in an investigation or litigation
4. To protect from unnecessary expense and time during discovery
5. To maintain control over discovery and e-discovery, and
6. To keep documents confidential and avoid leakage to attackers or competitors
Laws Related to Data Retention Policy - India
• In India, no Central Act lays down provisions for data retention.
• Instead, various agencies adopt, maintain and follow their own policies.
• Eg 1: The Government of India Central Vigilance Commission, vide notification
No.17/09/2006-Admn., gives the provisions related to the retention period/destruction
schedule of recorded files.
• Eg 2: The Ministry of Finance - Financial Intelligence Unit has its own policy.
Notification No. 9/2005 gives the “rules for Record Keeping and Reporting”.
• Rule 6. Retention of records - The records referred to in rule 3 shall be
maintained for a period of ten years from the date of cessation of the
transactions between the client and the banking company, financial institution or
intermediary, as the case may be.
• Thus, each organization has its own data retention policies and
certain rules for the retention of such records.
• However, no established law makes it binding on
organizations to prepare such policies.
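As a worked example of the ten-year period in Rule 6, the sketch below computes the earliest date on which a record could be destroyed after transactions cease. The function name and the handling of 29 February are assumptions made for illustration; they are not part of the rule.

```python
from datetime import date


def earliest_destruction_date(cessation: date, years: int = 10) -> date:
    """Date on which the retention period ends, counted from cessation."""
    try:
        return cessation.replace(year=cessation.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year.
        # Falling back to 28 February is an assumption of this sketch.
        return cessation.replace(year=cessation.year + years, day=28)


print(earliest_destruction_date(date(2015, 6, 30)))
```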
Confidential and Sensitive Data Handling
Definition of Sensitive Data
• Data collected may be personal, confidential or sensitive in nature.
• Personal data provides information about an individual, and through which an
individual can be easily and uniquely identified, either directly or indirectly.
• Confidential data is the personal data that is private and should not be disclosed
to others.
Types of Sensitive Data
• Personal Information
– Sensitive personally identifiable information is data that can be traced back
to an individual, thus revealing one’s identity.
– Such information includes biometric data, medical information and history,
bank and credit card information, Passport or Aadhaar numbers.
– Threats include not only crimes such as identity theft, but also disclosure of
personal information that the individual would prefer remained private.
– Sensitive data should be encrypted both in transit and at rest.
• Business Information
– Sensitive business information includes everything that poses a risk to the
company in question if discovered by a competitor or the general public.
– Such information includes trade secrets, contract details, acquisition plans,
financial data, supplier details, customer information.
– Methods of protecting corporate information from unauthorized access are
becoming integral to corporate security.
– These methods include deciding policy for security, metadata management
and document sanitization.
• Classified Information
– It pertains to a government body and is restricted according to the level of
sensitivity (e.g., restricted, confidential, secret, and top secret).
– Information is generally classified to protect security.
– Once the risk of harm has passed or decreased, classified information may
be declassified and, possibly, made public.
Handling of Sensitive Data
• Sensitive data needs to be handled with the utmost care and the highest possible security
measures.
• Given a dataset, one or more attribute values in a tuple/record can be sensitive and
hence need to be protected. At the same time, other attributes of the same
tuple/record can be made available.
• Thus, the access policy needs to be defined at different granularity levels so that the
non-sensitive attribute values can be made available while the sensitive ones stay protected.
• Eg: If a query seeks information on all patients having certain health
records, it should not reveal the identity of the individuals. Instead, an aggregate
function can be applied, such as giving the total count of patients suffering from
the health condition.
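The patient example above can be sketched as follows. The records, attribute names, and the choice of which attributes count as sensitive are invented for illustration; a real system would enforce this inside the database rather than in application code.

```python
# Hypothetical patient records; "name" is treated as the sensitive attribute.
patients = [
    {"name": "Patient A", "age": 34, "condition": "diabetes"},
    {"name": "Patient B", "age": 51, "condition": "asthma"},
    {"name": "Patient C", "age": 47, "condition": "diabetes"},
]

# Attribute-level access policy: only these attributes may be released.
RELEASABLE = ("age", "condition")


def count_with_condition(rows, condition):
    """Aggregate answer: how many patients have the condition, not who they are."""
    return sum(1 for row in rows if row["condition"] == condition)


def releasable_view(rows):
    """Copy of the rows with every non-releasable attribute dropped."""
    return [{key: row[key] for key in RELEASABLE} for row in rows]


print(count_with_condition(patients, "diabetes"))
print(releasable_view(patients))
```

Note that dropping the name does not by itself guarantee anonymity: the remaining attributes may still identify a person indirectly, as the definition of personal data earlier in this unit points out.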
Access Decision
• The database administrator decides what data should be in the database and who
should have access to it.
• These decisions are based on access policies that are defined in the
organization.
• Multiple factors are considered in making these policies, such as availability of
data, acceptability of the access, authenticity of the user, etc.
Types of Disclosures
Sensitive data can also be characterized based on what values are being disclosed.
• Displaying exact data: This is the most serious disclosure, where the user directly
gets the sensitive data on request or sometimes without request; the latter being a serious
security concern.
• Displaying Bounds: Bounds are a convenient way of presenting sensitive data,
indicating that the sensitive value lies between a low and a high value. Eg: An organization
can reveal the range of salaries given to its managers, so that anyone considering
joining the organization can decide based on it.
• Displaying negative results: Sometimes a query could display a negative result,
specifying that a particular value is not present. This is of particular importance if the
data is of binary type and is represented as 0 or 1, because learning that a value is not 1
discloses that it is 0. However, in some cases a negative result reveals little, for example
learning that a student does not appear in the top 10 list.
• Displaying probable values: Sometimes it may be possible to determine the
probability that a certain attribute will hold a particular value.
• Sensitive data can be secured by keeping it in an encrypted format so that the
information is not accidentally revealed. But this can be tedious if different
attributes need different levels of confidentiality.
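The difference between exact disclosure and disclosure of bounds, and the need for a different treatment per attribute, can be sketched as below. The DISCLOSURE table, its three modes, and the sample rows are assumptions for illustration. The sketch withholds or summarizes values; it does not encrypt them.

```python
# Hypothetical per-attribute disclosure policy.
DISCLOSURE = {
    "name": "hidden",       # never released
    "salary": "bounds",     # only the low and high value are released
    "department": "exact",  # released as stored
}

managers = [
    {"name": "M1", "salary": 90000, "department": "Sales"},
    {"name": "M2", "salary": 120000, "department": "IT"},
    {"name": "M3", "salary": 105000, "department": "Sales"},
]


def disclose(rows, policy):
    """Apply the disclosure mode of each attribute to a set of rows."""
    released = {}
    for attribute, mode in policy.items():
        values = [row[attribute] for row in rows]
        if mode == "exact":
            released[attribute] = values
        elif mode == "bounds":
            released[attribute] = (min(values), max(values))
        # "hidden" attributes are left out of the result entirely.
    return released


print(disclose(managers, DISCLOSURE))
```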
Handling Data
1. Create a risk aware culture that includes an information security risk management
program. Define security and risk mitigation and handling policies at the enterprise
level.
2. Define the data types used in the organization and classify them as confidential or sensitive.
3. Clarify responsibilities and accountability for the protection of confidential/sensitive
data.
4. Limit access to confidential/sensitive data to those for whom it is absolutely essential to
institutional processes.
5. Provide awareness and training to properly use the resources and follow the guidelines
and rules specified.
6. Verify compliance with your policies and procedures regularly.
Legal Provisions in India Defining Sensitive Data and its Handling
The Right to Information Act, 2005 gave a stimulus to transparency in government dealings and
concurrently provided some protection against the unwarranted disclosure of confidential
information under the law.
• A new civil provision prescribes damages for an entity that is negligent in using
“reasonable security practices and procedures” while handling “sensitive personal
data or information”, resulting in wrongful loss or wrongful gain to any person.
• Criminal punishment applies to a person who (a) discloses sensitive personal information,
(b) does so without the consent of the person or in breach of the relevant contract, and (c)
does so with the intention of, or knowing that the disclosure would cause, wrongful loss or gain.
• The IT rules introduced in 2011 define “sensitive personal data” for the first time in
India.
The salient features of the new rules are as follows:
• Sensitive personal information: The laws relate to dealing with information generally,
personal information and “sensitive personal data or information” (SPD). SPD is defined to
cover the following: (a) passwords; (b) financial and credit information such as bank account,
credit card, debit card or other payment instrument details; (c) physical, physiological and
mental conditions; (d) sexual orientation; (e) medical records and history; and (f) biometric
and deoxyribonucleic acid (DNA) information. It may be noted that SPD deals with
information of individuals and not information of businesses.
• Privacy policy: Every business needs to have a privacy policy that must be published on its
website. Even if the business is not handling SPD, it is required to have a privacy policy. It
must describe what information is collected, what is the purpose of using the information, to
whom or how the information might be disclosed and the sound security practices followed to
safeguard the information.
• Consent for collection: A business cannot collect SPD unless it obtains the prior
consent of the Information provider. The consent has to be provided by letter, fax or
email.
• Notification: The business should ensure that the information provider is aware
of the information being collected, the purpose of using the information, the
recipients of the information and the name and address of the agency collecting
the information.
• Use and Retention: The usage of personal information has to be restricted to
the purpose for which it was collected. The data retention rules have to be
followed in terms of maintaining the data for specified period as well as
destroying the data after that. The business should not maintain the SPD for
longer than specified.
• Rights of access, correction and withdrawal: The business should permit the
information provider the right to review the information, and should ensure that
any information found to be inaccurate or deficient be corrected. The
information provider also has the right to withdraw its consent to the collection
and use of the information.
• Transnational transfer: A business can only transfer the SPD or information to
a party overseas if the overseas party ensures the same level of protection
provided for under the Indian rules.
• Security procedures: The IT Act requires reasonable security procedures to be
maintained to escape liability. The security procedure has to be audited on a
regular basis by an independent auditor, approved by the Government of India.
Lifecycle Management Costs
• Data Lifecycle Management is the process of handling the flow of business
information throughout its lifespan, from requirements through maintenance.
• Information Lifecycle Management (ILM) is the consistent management of
information from creation to final disposition.
• It comprises strategy, process, and technology for managing information
effectively; combined, these drive improved control over information in
the enterprise.
• It aims at automating the processes involved in organizing data into separate tiers
according to the specified policies, and automating data migration from one tier to
another tier.
• As a rule, newer data, and data that must be accessed more frequently, is stored on
faster, but more expensive storage media, while less critical data is stored on
cheaper, but slower media.
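A minimal sketch of such a tiering policy is given below. The tier names and the 30-day and 365-day thresholds are assumptions for illustration; an actual ILM product would take them from the organization's specified policies and would also perform the migration itself.

```python
from datetime import date


def choose_tier(last_access: date, today: date,
                hot_days: int = 30, warm_days: int = 365) -> str:
    """Pick a storage tier from how recently the data was accessed."""
    idle_days = (today - last_access).days
    if idle_days <= hot_days:
        return "fast"      # expensive, low-latency media
    if idle_days <= warm_days:
        return "standard"  # cheaper, slower media
    return "archive"       # cheapest tier, e.g. tape or Hadoop


today = date(2024, 6, 1)
for name, last_access in [
    ("orders-current", date(2024, 5, 25)),
    ("orders-2023", date(2023, 11, 1)),
    ("orders-2019", date(2019, 12, 31)),
]:
    print(name, "->", choose_tier(last_access, today))
```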
Benefits of Information Lifecycle Management
• Reduced Risk: Reduce unneeded and expired information, and make your information
easier to manage and discover.
• Cost Saving: eDiscovery, storage, and legal hold costs can be reduced with better
management of information.
• Improved Service: Archiving, eDiscovery, and Records Management may become less
of a distraction and drain on IT and Legal.
• Effective Governance: ILM can introduce management rigor and controls that benefit
the enterprise. ILM can bring the added bonus of improved management of information
for the entire business.
Five Stages of Data Lifecycle
• Data Creation
– When an employee or client creates and saves a file, that data becomes a part of the
organization’s daily operation.
– Enterprises often store this active data locally and on a network server while backing it
up on local storage appliances or cloud storage.
– This setup provides for fast recovery in case of data loss.
• Backup storage against data loss
– As the system’s efficiency increases, the enterprise can replicate the data from primary
storage into less costly off-site tape vaults or to the cloud.
– In case of a major outage or disaster, the data can be restored completely.
– The backup of the data and the amount of replication depend on the type and value of
the data.
• Archiving helps contain storage costs
– Older inactive data that is not frequently handled can be retained in case of a legal, regulatory
or audit event.
– Various data storage networks can be used to archive the data, or data can be retained using
cloud or Hadoop.
– Offsite tapes offer high security, quick access, and lower storage costs for such long-term data
storage demands.
– This kind of low-cost tape is particularly well suited to unstructured data such as email.
• Ensuring secure data destruction
– The final stage of data lifecycle requires secure data destruction, which is typically governed
by a schedule that defines when and how you must destroy unwanted data.
– Once data reaches its expiration date, secure media destruction can ensure its environmentally
friendly disposal.
• Put secure IT asset disposition to work
– The data storage lifecycle does not end until the last traces of data are destroyed, and this
includes information remaining within any obsolete hardware or peripherals.
– As with media destruction, maintain the chain of custody when eliminating any old computers
and office equipment.
Efficient Information Lifecycle Management
• To handle large amounts of data, the storage needs to be scalable. Hence,
a flexible architecture should be considered for storage.
• Analytics applications in some cases need access to archived and unstructured data. To
support analytics and informed decision-making, data can be archived into frameworks like
Hadoop.
• Storage can be optimized for maintenance and licensing costs by migrating rarely used
data into a framework like Hadoop.
To proficiently manage data throughout its entire lifecycle, organizations must keep three
objectives in mind:
• Data veracity (trustworthiness) is critical for both analytics and regulatory compliance.
• Both structured and unstructured data must be managed effectively.
• Data privacy and security must be protected at all times.
Archive Data Using Hadoop
• Hadoop offers inexpensive storage for any type of data, whether structured,
semi-structured or unstructured, plus the ability to query
that data using SQL commands.
• Hadoop utilizes commodity hardware and can be easily scaled to
accommodate new data.
• Thus, the Hadoop environment can be used to archive and process the data.
• The Hadoop tool used to perform archiving is Sqoop, which can move the data to be
archived from the data warehouse into Hadoop.
• You will need to consider what form you want the data to take in your Hadoop
cluster. In general, compressed Hive files are a good option.
• Archiving everything has an advantage of providing a single interface across the entire
dataset for issuing queries.
• Archiving only part of the data would require queries to be executed on both the archived data
and the active data, and the results of the two queries to be merged.
• An enterprise data warehouse archiving solution for Hadoop must provide three key features:
– Schema conservation: The archive must precisely duplicate the schema of the source
warehouse. It is essential to confirm that data values will be archived without loss of
precision. Changes to the source schema, for example, adding new columns or changing data
types, should also be captured by the archive.
– Control and security: The archive must provide access to data on a “need to know” basis; it
must guarantee that sensitive data is encrypted or masked, and that access is audited.
– Querying support: Support for SQL access to the archived data is essential. Applications
need to use the archived data to generate reports or to perform
analysis.
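The Sqoop-based transfer into compressed Hive tables and the schema requirement above can be sketched together in Python. The first function only assembles a `sqoop import` command line and does not run it; the JDBC URL, user name, password file path and table names are hypothetical placeholders. The second function is a simple check, written for this example, that the archived schema still matches the source schema.

```python
import shlex


def build_sqoop_import(jdbc_url, username, password_file, table, hive_table,
                       mappers=4):
    """Assemble a Sqoop import that lands a warehouse table in Hive, compressed."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,
        "--hive-import",
        "--hive-table", hive_table,
        "--compress",
        "--num-mappers", str(mappers),
    ]


def schema_drift(source, archive):
    """List columns that are missing from the archive or whose type differs."""
    issues = []
    for column, column_type in source.items():
        if column not in archive:
            issues.append(f"missing column: {column}")
        elif archive[column] != column_type:
            issues.append(f"type mismatch: {column}")
    return issues


# All connection details below are placeholders, not real systems.
command = build_sqoop_import(
    "jdbc:mysql://warehouse.example.com/sales", "archiver",
    "/user/archiver/.password", "ORDERS_2015", "archive.orders_2015",
)
print(shlex.join(command))

source_schema = {"order_id": "BIGINT", "amount": "DECIMAL(12,2)", "note": "STRING"}
archive_schema = {"order_id": "BIGINT", "amount": "DOUBLE"}
print(schema_drift(source_schema, archive_schema))
```

In the sample data the archive stores `amount` as DOUBLE instead of DECIMAL(12,2), which is exactly the kind of precision loss the schema requirement warns about.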
Testing and Delivering Big Data Applications for
Performance and Functionality
• Testing a big data application is more a verification of its data processing than a test of
the individual features of the software product.
• When it comes to big data testing, performance and functional testing are the key
components to evaluate.
• The testing of Hadoop big data application can be performed as a two-step process.
– Checking the functionality: The business logic encoded using MapReduce programs
is tested in this phase. For this, unit testing can be performed and executed in the
pseudo-distributed mode.
– Checking on the cluster: Once the business logic is validated, it can be tested on the
cluster for the performance and failover. Performance testing includes testing of job
completion and the time taken, utilization of the memory and other resources, data
throughput, etc. Failover testing includes failure of one or more daemons running in
Hadoop, namely, NameNode, DataNode, Resource Manager, Node Manager, or failure of
the device through which the distributed environment is made available.
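The first step, checking the functionality, can be illustrated with a unit test of map and reduce logic that runs without a cluster. The word-count job below is an assumed example, not a program from this unit, and `run_local` only imitates the shuffle-and-sort phase with an in-memory sort.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Map step: emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Reduce step: total the counts collected for one word."""
    return word, sum(counts)


def run_local(lines):
    """Imitate map, shuffle/sort and reduce in memory for testing."""
    pairs = sorted(pair for line in lines for pair in mapper(line))
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(pairs, key=itemgetter(0))
    )


# Unit checks on the business logic alone.
assert list(mapper("Big data")) == [("big", 1), ("data", 1)]
assert reducer("big", [1, 1, 1]) == ("big", 3)
assert run_local(["big data", "Big test"]) == {"big": 2, "data": 1, "test": 1}
print("functional checks passed")
```

Once such checks pass, the same logic would be run on the cluster for the performance and failover tests described above.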
Testing big data applications has several challenges, which include the following:
• Automation: Support of automation tools for performing testing is not available. Thus,
automation in testing for big data requires someone with technical expertise. Also, automated
tools are not equipped to handle unexpected problems that arise during testing.
• Virtualization: Testing, especially unit testing, is usually performed in a virtual environment.
It is one of the fundamental phases of testing. Virtual machine latency creates timing
problems in real time big data testing. Also, managing images in big data is a hassle.
• Large dataset: The amount of data is huge and can have many variations. Further they can
originate from different sources, thus integrating data is a major challenge. Thus, more data
needs to be verified and this needs to be done at faster rate.
• Testing across platforms: Hadoop is a collection of various tools. The applications can be
written using any of the tools. Thus, there is a need for tools that enable testing across
different platforms.
• Monitoring and diagnostic solution: There are limited solutions that can monitor the entire
execution environment and detect bottlenecks or failures.
Challenges with Data Administration
• The Data administrator is responsible for designing and maintaining data stores.
• Data administration is the method by which data is monitored, managed and
maintained by a person or an organisation.
• Data administration allows an organisation to check its data resources, along with their
processing and communications with different applications and business processes.
• The data administrator needs to integrate data from multiple sources and provide it to
various applications.
• The data administrator deals with designing the logical and conceptual models, treating
the data at an organisational level, whereas the database administrator deals with the
implementation of the databases required and in use.
Responsibilities of the Data Administrator
1. Data Policies, Procedures, Standards
• The data administrator should set the data creation and handling policies, which include details of
which application can interact with which data, how that data can be changed and what the effect
of the change is.
• Data Procedures are documented plans of action for performing a certain activity, such as
backup and recovery procedures. The data administrator’s role is to ensure that these procedures are
defined and communicated to all concerned employees.
• Data Standards are unambiguous conventions and behaviours that need to be followed so that
maintenance becomes easy. They can also be used to evaluate database quality.
2. Planning
• Effective administration of data requires an understanding of the organisation’s needs and the
ability to lead the development of an information architecture that will meet the diverse needs of
the organisation.
• Thus a data administrator needs to plan for effective administration of data and also provide
support for future needs.
3. Data Conflict(ownership) Resolution
• Data stores are planned to be shared and usually involve data from several different departments of
the organisation.
• Ownership of data is a sensitive issue in every organisation.
• Data administrator should establish procedures for resolving any conflicts in ownership.
4. Managing the Data Repository
• Data Repositories contain metadata that describes the data stored in data stores.
• They describe an organisation’s data and data processing resources.
• As the data stores are increasing in size and incorporating unstructured data, data repositories need
to be enhanced to incorporate new and unseen data.
5. Internal Marketing of DA Concepts
• For data administration to be effective, established policies and procedures must be made known
to the internal staff. These may reduce resistance to changes or ownership problems.
Responsibilities of the Database Administrator
1. Designing the Database
• The administrator is responsible for defining and creating the logical data model, physical
database model and prototyping.
2. Security and Authorization
• The database administrator ensures that there is no unauthorized access to data. In general,
the data should not be accessible to everyone.
• In a database system, a user may be granted permission to access only certain views and
relations.
• The administrator can enforce various authentication and authorization techniques through
which access is granted only to specific entities.
• Authentication techniques ensure that the person is who they claim to be,
while authorization techniques decide which data that person may access.
3. Data Availability and Recovery from Failures
• The administrator makes sure that the data is available at all times.
• In case of database failure, the administrator should ensure that the data is made
available to its user in such a way that the users are unaware of the failure.
• The administrator also ensures that the data remains in a consistent state and
appropriate techniques to achieve these are implemented.
4. Database Tuning
• The database needs to evolve over time as the users’ needs change.
• The administrator should modify the structure or design of the database to incorporate
these changes.
• The DBA is responsible for modifying the database, in particular the conceptual and
logical design.
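The distinction drawn under Security and Authorization, between checking who a user is and deciding what that user may access, can be sketched as follows. The user name, password, view names and in-memory tables are invented for illustration; a real database system keeps these in its own catalog and enforces them itself.

```python
import hashlib
import hmac
import os

SALT = os.urandom(16)


def hash_password(password: str) -> bytes:
    """Derive a password hash so that the clear text is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), SALT, 100_000)


# Hypothetical user store and grant table.
USERS = {"asha": hash_password("s3cret")}
GRANTS = {"asha": {"sales_summary_view"}}


def authenticate(user: str, password: str) -> bool:
    """Authentication: is this person who they claim to be?"""
    stored = USERS.get(user)
    return stored is not None and hmac.compare_digest(stored, hash_password(password))


def authorize(user: str, relation: str) -> bool:
    """Authorization: may this user access the given view or relation?"""
    return relation in GRANTS.get(user, set())


print(authenticate("asha", "s3cret"), authorize("asha", "sales_summary_view"))
print(authenticate("asha", "wrong"), authorize("asha", "payroll"))
```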
Challenges of Data Administrator
• Creating the Data Repository
– With huge amounts of data flowing in from various sources, integrating it to create
a common data repository is challenging.
– This is further complicated since the data is in an unstructured format.
– Pre-processing is an important step in preparing the data for processing and
efficient techniques need to be developed.
• Evolving Nature of Data Consideration in Analysis
– A modern administrator is required to understand a wide range of domains,
as organizations are now dealing with new types of data.
– Eg: Machine data is centrally logged and stored. To track a machine’s
performance, administrators need to understand its data well enough to gain insight from it, even
if they do not possess the relevant technical background.
• Emphasize the capability to build a database quickly, tune it for maximum
performance and restore it to production quickly when problems develop.
• Enforcing the data policies and standards especially those related to security.
• As the organization’s needs change, efficient support should be provided to
incorporate the changes and make provision for future scope.
• Ownership of the data is not restricted to the internal staff. With social
media, it is tricky to define the ownership of data.
• The administrator is always expected to keep abreast of new technologies and is
usually involved in mission-critical applications.
• Another challenging aspect is that data administrators are required to have a
comprehensive understanding of a wide variety of topics to understand and improve
business processes in their organization.
THE ROLE OF INFORMATION RESOURCES AND ENTERPRISE SYSTEM IN DYSONsreeragtg
 
Challenges and emerging practices for knowledge organization in the electron...
Challenges and emerging practices for knowledge  organization in the electron...Challenges and emerging practices for knowledge  organization in the electron...
Challenges and emerging practices for knowledge organization in the electron...Anil Mishra
 

IT6701 Information Management Unit - V

  • 1. IT6701 – Information Management, Unit V – Information Lifecycle Management. By Kaviya.P, AP/IT, Kamaraj College of Engineering & Technology
  • 2. Unit V – Information Lifecycle Management: Data retention policies; confidential and sensitive data handling; lifecycle management costs; archive data using Hadoop; testing and delivering big data applications for performance and functionality; challenges with data administration
  • 3. Data Retention Policies What are Data Retention Policies? • A document retention policy provides for the systematic review, retention and destruction of documents received or created in the course of business. • A document retention policy will identify documents that need to be maintained and contain guidelines for how long certain documents should be kept and how they should be destroyed. Purpose of Data Retention Policies • To maintain important records and documents for future use or reference. • To dispose of records or documents that are no longer needed. • To organize records so that they can be searched and accessed easily at a later date.
  • 4. Data Retention Policies Categories of Requirements • Legal or Legitimate requirements: The compliance or legal aspect, where a certain legal case is filed and some piece of information needs to be produced in a court of law. • Business or Commercial requirements: To make information available from the operational perspective. • Personal or Private requirements: To make information available from the personal perspective.
  • 5. Data Retention Policies Scope : Categories of Document (What documents must be protected?) • Legal Records: These include all legal records, contracts, trademarks, powers of attorney, press releases, etc. These are the first set of documents that should be considered for retention. • Final Records: Documents not requiring ad hoc modification or alteration. They can also specify records of completed activities. • Permanent Records: Include all the business documents that describe the organization’s details. They can also comprise contracts, financial registers, copyrights, patents, proposals. • Accounting and Corporate Tax Records: Consist of financial statements, investments, audits, tax returns, purchase and sales records, etc.
  • 6. Data Retention Policies Scope : Categories of Document (What documents must be protected?) • Workplace Records: Information about the day-to-day activities of employees, agreements, minutes of meetings, bylaws, etc. • Employment, Employee, and Payroll Records: Include job postings, job advertisements, recruitment procedures, performance reviews, etc. • Bank Records: Information about bank transactions, deposits, cheque details, stop payments, bounced cheques. • Historic Records: Records that are no longer required by the organization. • Temporary Records: Documents that are not completed or finalized.
  • 7. Data Retention Policies Data Retention Policy • When developing a retention policy, it is important to focus on the reason behind data retention. • The decision is based on the creation date and may include other criteria such as last access time, type of data, the time until which the data is valid, data value, etc. • The policy document should include details of the data/document that needs to be retained. • The data should be divided into various categories such as personal employee data, client data, financial data, legal data, etc. • This division would help in deciding the duration of retention and destruction procedures. • When the data retention period is over, the data should be discarded.
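The retention decision described in slide 7 can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the categories, the retention periods and the `Record` type are all hypothetical, and real periods must come from the organization's own policy and the applicable rules.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods per data category, in years.
# Real values must come from the organization's policy and applicable law.
RETENTION_YEARS = {
    "financial": 10,
    "employee": 7,
    "client": 5,
}


@dataclass
class Record:
    name: str
    category: str
    created: date


def retention_expired(record: Record, today: date) -> bool:
    """Return True when the record has outlived its category's retention period."""
    years = RETENTION_YEARS[record.category]
    # A year is approximated as 365 days; a real policy would define this precisely.
    return today - record.created > timedelta(days=365 * years)


def records_to_discard(records, today):
    """List the names of records whose retention period is over."""
    return [r.name for r in records if retention_expired(r, today)]


records = [
    Record("ledger-2009", "financial", date(2009, 4, 1)),
    Record("client-file-2021", "client", date(2021, 6, 15)),
]
print(records_to_discard(records, today=date(2024, 1, 1)))  # ['ledger-2009']
```

The sketch uses only the creation date and the category. Other criteria named in the slide, such as last access time or data value, could be added as extra fields on the record.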
  • 8. Data Retention Policies Why Have Data Retention Policies? The policy is also helpful to: • Provide a system for complying with document retention laws • Ensure that valuable documents are available when needed • Save money, space and time • Protect against allegations of selective document destruction, and • Provide for the routine destruction of non-business, superfluous and outdated documents
  • 9. Data Retention Policies Why Have Data Retention Policies? The six most important reasons why an organization should implement a document retention policy are: 1. To comply with legal duties and requirements, either statutory or regulatory 2. To avoid liability through “spoliation”, the improper destruction or alteration of documents in a litigation situation 3. To support or oppose a position in an investigation or litigation 4. To protect from unnecessary expense and time during discovery 5. To maintain control over discovery and e-discovery, and 6. To keep documents confidential and avoid leakage to attackers or competitors
  • 10. Data Retention Policies Laws Related to Data Retention Policy - India • In India there is no Central Act that lays down provisions related to data retention. • However, various agencies have incorporated their own policies, which they maintain and follow. • Eg 1: The Central Vigilance Commission, Government of India, vide notification No. 17/09/2006-Admn., gives the provisions related to the retention period/destruction schedule of recorded files. • Eg 2: The Ministry of Finance - Financial Intelligence Unit has its own policy. Notification No. 9/2005 gives the “rules for Record Keeping and Reporting”.
  • 11. Data Retention Policies Laws Related to Data Retention Policy - India • Rule 6. Retention of records - The records referred to in rule 3 shall be maintained for a period of ten years from the date of cessation of the transactions between the client and the banking company, financial institution or intermediary, as the case may be. • Thus, it may be noted that each organization has its own data retention policies and certain rules for retention of such records. • However, there is no established law that binds organizations to prepare such policies.
  • 12. Confidential and Sensitive Data Handling Definition of Sensitive Data • Data collected may be personal, confidential or sensitive in nature. • Personal data provides information about an individual through which the individual can be easily and uniquely identified, either directly or indirectly. • Confidential data is personal data that is private and should not be disclosed to others.
  • 13. Confidential and Sensitive Data Handling Types of Sensitive Data • Personal Information – Sensitive personally identifiable information is data that can be traced back to an individual, thus revealing one’s identity. – Such information includes biometric data, medical information and history, bank and credit card information, Passport or Aadhaar numbers. – Threats include not only crimes such as identity theft, but also disclosure of personal information that the individual would prefer remained private. – Sensitive data should be encrypted both in transit and at rest.
  • 14. Confidential and Sensitive Data Handling Types of Sensitive Data • Business Information – Sensitive business information includes everything that poses a risk to the company in question if discovered by a competitor or the general public. – Such information includes trade secrets, contract details, acquisition plans, financial data, supplier details, customer information. – Methods of protecting corporate information from unauthorized access are becoming integral to corporate security. – These methods include deciding policy for security, metadata management and document sanitization.
  • 15. Confidential and Sensitive Data Handling Types of Sensitive Data • Classified Information – It pertains to a government body and is restricted according to the level of sensitivity. (Eg: restricted, confidential, secret, and top secret) – Information is generally classified to protect security. – Once the risk of harm has passed or decreased, classified information may be declassified and, possibly, made public.
  • 16. Confidential and Sensitive Data Handling Handling of Sensitive Data • Sensitive data needs to be handled with the utmost care and the highest possible security measures. • Given a dataset, one or more attribute values in the tuple/record can be sensitive and hence need to be protected. But at the same time, other attributes of the same tuple/record can be made available. • Thus, the access policy needs to be defined at different granularity levels so that access to the values of these attributes can be made available. • Eg: If a query is triggered seeking information on all the patients having certain health records, it should not reveal the identity of the individuals. Instead, an aggregate function can be applied, such as returning the total count of patients suffering from the health condition.
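The patient example in slide 16 can be sketched as follows. This is a minimal illustration under stated assumptions: the records, the `SENSITIVE` attribute set and both function names are hypothetical, and the in-memory list stands in for a real database.

```python
# Hypothetical patient records; "name" is the sensitive, identifying attribute.
PATIENTS = [
    {"name": "A", "ward": "W1", "condition": "diabetes"},
    {"name": "B", "ward": "W2", "condition": "asthma"},
    {"name": "C", "ward": "W1", "condition": "diabetes"},
]

# Attributes that must not be released at record level (an assumption for this sketch).
SENSITIVE = {"name"}


def count_with_condition(patients, condition):
    """Answer the query with an aggregate count instead of identifying rows."""
    return sum(1 for p in patients if p["condition"] == condition)


def public_view(record):
    """Attribute-level access: release a record without its sensitive attributes."""
    return {key: value for key, value in record.items() if key not in SENSITIVE}


print(count_with_condition(PATIENTS, "diabetes"))  # 2
print(public_view(PATIENTS[0]))  # {'ward': 'W1', 'condition': 'diabetes'}
```

The first function shows the aggregate approach, and the second shows access defined at attribute granularity, where some values of a record are released while others stay protected.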
  • 17. Confidential and Sensitive Data Handling Access Decision • The database administrator decides what data should be in the database and who should have access to it. • These decisions are based on access policies that are defined in the organization. • Multiple factors are considered in making these policies such as availability of data, acceptability of the access, authenticity of the user, etc.
  • 18. Confidential and Sensitive Data Handling Types of Disclosures Sensitive data can also be characterized based on what values are being disclosed. • Displaying exact data: This is the most serious disclosure where the user will directly get the sensitive data on request or sometimes without request; the latter being a serious security concern. • Displaying Bounds: Bounds are a convenient way of presenting sensitive data, indicating that the sensitive value lies between a low and a high value. Eg: An organization can reveal the range of salaries given to its managers, such that any person willing to join the organization can make a decision based on it.
  • 19. Confidential and Sensitive Data Handling Types of Disclosures Sensitive data can also be characterized based on what values are being disclosed. • Displaying negative results: Sometimes a query could display a negative result, specifying that a particular value is not present. This is of particular importance if the data is of binary type and is represented as 0 or 1. Thus, disclosing a value of 0 is significant. However, in certain cases displaying information like whether a student will appear in the top 10 list would not reveal significant information. • Displaying probable values: Sometimes it may be possible to determine the probability that a certain attribute will hold a particular value. • Sensitive data can be secured by keeping it in an encrypted format so that the information is not accidentally revealed. But this can be tedious sometimes, if different attributes need different levels of confidentiality.
  • 20. Confidential and Sensitive Data Handling Handling Data 1. Create a risk aware culture that includes an information security risk management program. Define security and risk mitigation and handling policies at the enterprise level. 2. Define data types used in the organization and classify them as confidential or sensitive. 3. Clarify responsibilities and accountability for the protection of confidential/sensitive data. 4. Limit the access to confidential/sensitive data only to those absolutely essential to institutional process. 5. Provide awareness and training to properly use the resources and follow the guidelines and rules specified. 6. Verify compliance regularly with your policies and procedures.
  • 21. Confidential and Sensitive Data Handling Law provision in India Defining Sensitive Data and its Handling The Right to Information Act, 2005 gave a stimulus to transparency in government dealings and concurrently provided some protection against the unwarranted disclosure of confidential information under the law. • A new civil provision prescribing damages for an entity that is negligent in using “reasonable security practices and procedures” while handling “sensitive personal data or information” resulting in wrongful loss or wrongful gain to any person. • Criminal punishment for a person if s/he (a) discloses sensitive personal information; (b) does so without the consent of the person or in breach of the relevant contract; and (c) does so with the intention of, or knowing that the disclosure would cause, wrongful loss or gain. • The IT Rules introduced in 2011 define “sensitive personal data” for the first time in India.
  • 22. Confidential and Sensitive Data Handling Law provision in India Defining Sensitive Data and its Handling The salient features of the new rules are as follows: • Sensitive personal information: The laws relate to dealing with information generally, personal information and “sensitive personal data or information” (SPD). SPD is defined to cover the following: (a) passwords; (b) financial and credit information such as bank account or credit card or debit card or other payment instrument details; (c) physical, physiological and mental conditions; (d) sexual orientation; (e) medical records and history; and (f) biometric and deoxyribonucleic acid (DNA) information. It may be noted that SPD deals with information of individuals and not information of businesses. • Privacy policy: Every business needs to have a privacy policy that must be published on its website. Even if the business is not handling SPD, it is required to have a privacy policy. It must describe what information is collected, the purpose of using the information, to whom or how the information might be disclosed and the sound security practices followed to safeguard the information.
  • 23. Confidential and Sensitive Data Handling Law provision in India Defining Sensitive Data and its Handling The salient features of the new rules are as follows: • Consent for collection: A business cannot collect SPD unless it obtains the prior consent of the information provider. The consent has to be provided by letter, fax or email. • Notification: The business should ensure that the information provider is aware of the information being collected, the purpose of using the information, the recipients of the information and the name and address of the agency collecting the information. • Use and Retention: The usage of personal information has to be restricted to the purpose for which it was collected. The data retention rules have to be followed in terms of maintaining the data for the specified period as well as destroying the data after that. The business should not maintain the SPD for longer than specified.
  • 24. Confidential and Sensitive Data Handling Law provision in India Defining Sensitive Data and its Handling The salient features of the new rules are as follows: • Rights of access, correction and withdrawal: The business should permit the information provider the right to review the information, and should ensure that any information found to be inaccurate or deficient is corrected. The information provider also has the right to withdraw consent to the collection and use of the information. • Transnational transfer: A business can only transfer the SPD or information to a party overseas if the overseas party ensures the same level of protection provided for under the Indian rules. • Security procedures: The IT Act requires reasonable security procedures to be maintained to escape liability. The security procedure has to be audited on a regular basis by an independent auditor, approved by the Government of India.
  • 25. Lifecycle Management Costs • Data Lifecycle Management is the process of handling the flow of business information throughout its lifespan, from requirements through maintenance. • Information Lifecycle Management (ILM) is the consistent management of information from creation to final disposition. • It comprises strategy, process, and technology to effectively manage information which, when combined, drive improved control over information in the enterprise. • It aims at automating the processes involved in organizing data into separate tiers according to the specified policies, and automating data migration from one tier to another tier. • As a rule, newer data, and data that must be accessed more frequently, is stored on faster, but more expensive storage media, while less critical data is stored on cheaper, but slower media.
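The tiering rule in slide 25 can be sketched as a small policy function. The thresholds, tier names and dataset names below are hypothetical and chosen only for illustration; a real ILM policy would define them per data class and could weigh criticality as well as access recency.

```python
from datetime import date

# Hypothetical thresholds; a real ILM policy would set these per data class.
HOT_MAX_IDLE_DAYS = 30
WARM_MAX_IDLE_DAYS = 365


def choose_tier(last_access: date, today: date) -> str:
    """Pick a storage tier from the number of days since the last access."""
    idle_days = (today - last_access).days
    if idle_days <= HOT_MAX_IDLE_DAYS:
        return "fast"      # recently used data on faster, more expensive media
    if idle_days <= WARM_MAX_IDLE_DAYS:
        return "standard"
    return "archive"       # rarely used data on cheaper, slower media


def plan_migrations(datasets: dict, today: date) -> dict:
    """Map each dataset to (current_tier, target_tier) when the two differ."""
    moves = {}
    for name, (current_tier, last_access) in datasets.items():
        target = choose_tier(last_access, today)
        if target != current_tier:
            moves[name] = (current_tier, target)
    return moves


datasets = {
    "orders-current": ("fast", date(2024, 5, 20)),
    "orders-2019": ("fast", date(2020, 1, 10)),
}
print(plan_migrations(datasets, today=date(2024, 6, 1)))
# {'orders-2019': ('fast', 'archive')}
```

Running `plan_migrations` on a schedule is one simple way to automate migration between tiers according to the policy.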
  • 26. Lifecycle Management Costs Benefits of Information Lifecycle Management • Reduced Risk: Reduce unneeded and expired information, and make your information easier to manage and discover. • Cost Saving: eDiscovery, storage, and legal hold costs can be reduced with better management of information. • Improved Service: Archiving, eDiscovery, and Records Management may become less of a distraction and drain on IT and Legal. • Effective Governance: ILM can introduce management rigor and controls that benefit the enterprise. ILM can bring the added bonus of improved management of information for the entire business.
  • 27. Lifecycle Management Costs Five Stages of Data Lifecycle • Data Creation – When an employee or client creates and saves a file, that data becomes a part of the organization’s daily operation. – Enterprises often store this active data locally and on a network server while backing it up on local storage appliances or cloud storage. – This setup provides for fast recovery in case of data loss. • Backup storage against data loss – As the system’s efficiency increases, the enterprise can replicate the data from primary storage into less costly off-site tape vaults or to the cloud. – In case of a major outage or disaster, the data can be restored completely. – The backup of the data and the amount of replication depend on the type and value of the data.
  • 28. Lifecycle Management Costs Five Stages of Data Lifecycle • Archiving helps contain storage costs – Older inactive data that is not frequently handled can be retained in case of a legal, regulatory or audit event. – Various data storage networks can be used to archive the data, or data can be retained using cloud or Hadoop. – Offsite tapes offer high security, quick access and lower storage costs for such long-term data storage demands. – This kind of low-cost tape is particularly well suited to unstructured data such as email. • Ensuring secure data destruction – The final stage of the data lifecycle requires secure data destruction, which is typically governed by a schedule that defines when and how you must destroy unwanted data. – Once data reaches its expiration date, secure media destruction can ensure its environmentally friendly disposal.
  • 29. Lifecycle Management Costs Five Stages of Data Lifecycle • Put secure IT asset disposition to work – The data storage lifecycle does not end until the last traces of data are destroyed, and this includes information remaining within any obsolete hardware or peripherals. – As with media destruction, maintain the chain of custody when eliminating any old computers and office equipment. Efficient Information Lifecycle Management • To handle large amounts of data, the storage needs to be scalable. Hence, a flexible architecture should be considered for storage. • Analytics applications in some cases need to access archived and unstructured data. To leverage analytics and make informed decisions, data can be archived into frameworks like Hadoop. • The storage can be optimized for maintenance and licensing costs by migrating rarely used data into a framework like Hadoop.
  • 30. Lifecycle Management Costs To proficiently manage data throughout its entire lifecycle, organizations must keep three objectives in mind: • Data veracity (trustworthiness) is critical for both analytics and regulatory compliance. • Both structured and unstructured data must be managed effectively. • Data privacy and security must be protected at all times.
  • 31. Archive Data Using Hadoop • Hadoop offers inexpensive storage for any type of data (structured, semi-structured or unstructured), plus the ability to query Hadoop data using SQL commands. • Hadoop utilizes commodity hardware and can be easily scaled up to accommodate new data. • Thus, the Hadoop environment can be used to archive and process the data. • The Hadoop tool used to perform archiving is Sqoop, which can move the data to be archived from the data warehouse into Hadoop. • You will need to consider what form you want the data to take in your Hadoop cluster. In general, compressed Hive files are a good option.
  • 32. Archive Data Using Hadoop • Archiving everything has the advantage of providing a single interface across the entire dataset for issuing queries. • Partial availability of data would require queries to be executed on both the archived data and the active data, with the results of the two queries merged. • An enterprise data warehouse archiving solution for Hadoop must provide three key features: – Schema conversion: The archive must precisely duplicate the schema of the source warehouse. It is essential to confirm that data values will be archived without loss of precision. Changes to the source schema, for example, adding new columns or changing data types, should also be captured by the archive. – Control and security: The archive must provide access to data on a “need to know” basis; it must guarantee that sensitive data is encrypted or masked, and that access is audited. – Querying support: Support for SQL access to the archived data is essential. Applications need to use the archived data to generate reports or to perform analysis.
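The cost of partial archiving mentioned in slide 32, where one logical query has to run against both stores and the results have to be merged, can be sketched in plain Python. The two lists below are hypothetical stand-ins for the active warehouse and the Hadoop archive; no real warehouse or Hadoop API is used.

```python
# Hypothetical rows: the active store holds recent orders, the archive holds old ones.
ACTIVE = [
    {"order_id": 3, "year": 2024, "amount": 120},
    {"order_id": 4, "year": 2024, "amount": 75},
]
ARCHIVED = [
    {"order_id": 1, "year": 2015, "amount": 200},
    {"order_id": 2, "year": 2016, "amount": 40},
]


def run_query(rows, predicate):
    """Run one query against a single store."""
    return [row for row in rows if predicate(row)]


def query_everywhere(predicate):
    """Run the same query on both stores and merge the two result sets."""
    merged = run_query(ACTIVE, predicate) + run_query(ARCHIVED, predicate)
    return sorted(merged, key=lambda row: row["order_id"])


large_orders = query_everywhere(lambda row: row["amount"] >= 100)
print([row["order_id"] for row in large_orders])  # [1, 3]
```

When everything is archived into one place, the merge step disappears and a single query interface is enough, which is the advantage the slide describes.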
  • 33. Testing and Delivering Big Data Applications for Performance and Functionality • Testing a big data application is more a verification of its data processing than a test of the individual features of the software product. • When it comes to big data testing, performance and functional testing are the key components to evaluate. • The testing of a Hadoop big data application can be performed as a two-step process. – Checking the functionality: The business logic encoded using MapReduce programs is tested in this phase. For this, unit testing can be performed and executed in the pseudo-distributed mode. – Checking on the cluster: Once the business logic is validated, it can be tested on the cluster for performance and failover. Performance testing includes testing of job completion and the time taken, utilization of the memory and other resources, data throughput, etc. Failover testing includes the failure of one or more daemons running in Hadoop, namely, NameNode, DataNode, Resource Manager, Node Manager, or failure of the device through which the distributed environment is made available.
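The first step in slide 33, checking the functionality of the business logic before it runs on a cluster, can be sketched with a word count written as separate map and reduce functions. This is plain Python used as a stand-in, not Hadoop's MapReduce API, and the word-count logic itself is a hypothetical example of "business logic".

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def run_job(lines):
    """Run map then reduce locally, with no cluster involved."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return reduce_counts(pairs)


def test_word_count():
    """Unit test of the business logic on a tiny, hand-checked input."""
    result = run_job(["big data", "Big testing"])
    assert result == {"big": 2, "data": 1, "testing": 1}


test_word_count()
print("word count logic verified locally")
```

Only after tests like this pass does it make sense to move to the second step and measure completion time, resource usage and failover behaviour on the cluster.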
  • 34. Testing and Delivering Big Data Applications for Performance and Functionality Testing big data applications has several challenges, which include the following: • Automation: Support of automation tools for performing testing is not available. Thus, automation in testing for big data requires someone with technical expertise. Also, automated tools are not equipped to handle unexpected problems that arise during testing. • Virtualization: Testing, especially unit testing, is usually performed in a virtual environment. It is one of the fundamental phases of testing. Virtual machine latency creates timing problems in real time big data testing. Also, managing images in big data is a hassle. • Large dataset: The amount of data is huge and can have many variations. Further, the data can originate from different sources, so integrating it is a major challenge. Thus, more data needs to be verified and this needs to be done at a faster rate. • Testing across platforms: Hadoop is a collection of various tools. The applications can be written using any of the tools. Thus, there is a need for tools that will enable testing across different platforms. • Monitoring and diagnostic solution: There are limited solutions that can monitor the entire execution environment and detect bottlenecks or failures.
  • 35. Challenges with Data Administration • The Data administrator is responsible for designing and maintaining data stores. • Data administration is the method by which data is monitored, managed and maintained by a person or an organisation. • Data administration allows an organisation to check its data resources, along with their processing and communications with different applications and business processes. • The Data Administrator needs to integrate data from multiple resources and provide it to various applications. • The Data administrator deals with designing the logical and conceptual models, treating the data at an organisational level, whereas the Database administrator deals with the implementation of the databases required and in use.
  • 36. Challenges with Data Administration Responsibility of Data Administrator 1. Data Policies, Procedures, Standards • The Data administrator should set the data creation and handling policies, which include details of which application can interact with which data, how that data can be changed and what the effect of the change is. • Data Procedures are documented plans of action to be taken to perform a certain activity, like backup and recovery procedures. The Data administrator’s role is to ensure that these procedures are defined and communicated to all concerned employees. • Data Standards are unambiguous conventions and behaviours that need to be followed so that maintenance becomes easy. They can also be used to evaluate database quality. 2. Planning • Effective administration of data requires an understanding of the organisation’s needs and the ability to lead the development of an information architecture that will meet the diverse needs of the organisation. • Thus a data administrator needs to plan for an effective administration of data and also provide support for future needs.
  • 37. Challenges with Data Administration Responsibility of Data Administrator 3. Data Conflict (ownership) Resolution • Data stores are planned to be shared and usually involve data from several different departments of the organisation. • Ownership of data is a sensitive issue in every organisation. • The Data administrator should establish procedures for resolving any conflicts in ownership. 4. Managing the Data Repository • Data Repositories contain metadata that describes the data stored in data stores. • They describe an organisation’s data and data processing resources. • As the data stores are increasing in size and incorporating unstructured data, data repositories need to be enhanced to incorporate new and unseen data. 5. Internal Marketing of DA Concepts • For data administration to be effective, established policies and procedures must be made known to the internal staff. These may reduce resistance to changes or ownership problems.
  • 38. Challenges with Data Administration Responsibility of Data Administrator 1. Designing the Database • The administrator is responsible for defining and creating the logical data model, physical database model and prototyping. 2. Security and Authorization • The database administrator ensures that there is no unauthorized access to data. In general, the data should not be accessible to everyone. • In a database system, a user may be granted permission to access only certain views and relations. • The administrator can enforce various authentication and authorization techniques through which access can be granted only to specific entities. • Authentication techniques ensure that the person is the individual they claim to be, while authorization techniques decide which data that person may access.
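The distinction between authentication and authorization in slide 38 can be sketched as two separate checks. Everything here is hypothetical and for illustration only: the user name, password, relation names and the fixed salt. A real system would rely on the database's own account and privilege mechanisms and would use a random salt per user.

```python
import hashlib
import hmac

SALT = b"demo-salt"  # assumption for the sketch; use a random per-user salt in practice


def hash_password(password: str) -> bytes:
    """Derive a password hash with PBKDF2 from the standard library."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), SALT, 100_000)


# Hypothetical credential store and per-user permissions on views/relations.
USERS = {"asha": hash_password("s3cret")}
PERMISSIONS = {"asha": {"orders_view"}}


def authenticate(user: str, password: str) -> bool:
    """Authentication: is this person who they claim to be?"""
    stored = USERS.get(user)
    if stored is None:
        return False
    return hmac.compare_digest(stored, hash_password(password))


def authorize(user: str, relation: str) -> bool:
    """Authorization: may this user access this view or relation?"""
    return relation in PERMISSIONS.get(user, set())


def can_read(user: str, password: str, relation: str) -> bool:
    """Access is granted only when both checks pass."""
    return authenticate(user, password) and authorize(user, relation)


print(can_read("asha", "s3cret", "orders_view"))   # True
print(can_read("asha", "s3cret", "payroll"))       # False: authenticated, not authorized
print(can_read("asha", "wrong", "orders_view"))    # False: not authenticated
```

Keeping the two checks separate mirrors the slide: a correct password never implies access to every relation.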
  • 39. Challenges with Data Administration Responsibility of Data Administrator 3. Data Availability and Recovery from Failures • The administrator makes sure that the data is available at all times. • In case of database failure, the administrator should ensure that the data is made available to its users in such a way that the users are unaware of the failure. • The administrator also ensures that the data remains in a consistent state and that appropriate techniques to achieve this are implemented. 4. Database Tuning • Data needs to evolve over time as the users’ needs change. • The administrator should modify the structure or design of the database to incorporate these changes. • The DBA is responsible for modifying the database, in particular the conceptual and logical design.
  • 40. Challenges with Data Administration Challenges of Data Administrator • Creating the Data Repository – With huge amounts of data flowing in from various sources, integrating it to create a common data repository is challenging. – This is further complicated since the data is in an unstructured format. – Pre-processing is an important step in preparing the data for processing and efficient techniques need to be developed. • Evolving Nature of Data Consideration in Analysis – A modern administrator is required to have an understanding of vast domains, as organizations are now dealing with new types of data. – Eg: Machine data is centrally logged and stored. To track a machine’s performance, its data needs to be understood well enough to gain insight from it, even if the administrator does not possess the relevant technical background.
  • 41. Challenges with Data Administration Challenges of Data Administrator • Emphasize the capability to build a database quickly, tune it for maximum performance and restore it to production quickly when problems develop. • Enforcing the data policies and standards, especially those related to security. • As the organization’s needs are changing, efficient support should be provided to incorporate the changes and make provision for future scope. • Ownership of the data is not restricted to the internal staff. With social media, it is tricky to define the ownership of data. • The administrator is always expected to keep abreast of new technologies and is usually involved in mission critical applications. • Another challenging aspect is that data administrators are required to have a comprehensive understanding of a wide variety of topics to understand and improve business processes in their organization.