Unmanaged data poses risks to organizations, such as unauthorized access, inability to meet regulatory requirements, and increased storage costs. To address these risks, organizations should take steps to gain control over their unmanaged data, including reducing data loss, gaining visibility into information assets, defining governance policies, applying controls, and enabling information discovery. Taking these steps will help organizations effectively manage data growth and risks while increasing the value of their information assets.
2. What is Unmanaged Data?
• All organisations have experienced a growth in data.
• Not just inside the perimeters of the organisation.
• Doesn’t fit into traditional relational systems.
Mark D. Nicholls - November
2012 Not Protectively Marked
3. Risks of Unmanaged Data
• Who has access to our data?
• Do we meet our regulatory requirements?
• If a subject access request (SAR) were made, could we respond?
• Increased Storage costs.
4. Steps to getting data back to health
• Step 1: Reduce data loss from the enterprise perimeter
• Step 2: Gain visibility of your information assets
• Step 3: Define information governance policies
• Step 4: Apply controls
• Step 5: Enable information discovery
5. Reduce data loss from the enterprise perimeter
• Cost of a Data Breach puts the average cost to an enterprise at £1.75m in the UK
6. Gain visibility of your information assets
• What will make the hackers rich?
• What could damage the reputation of the organisation?
• What needs to be kept for regulatory compliance?
7. Define information governance policies
• Information Life Cycle Management Policy
8. Apply controls
• People – Educate, Train and make aware of risks.
• Processes – Change, Manage and automate.
• Technology – Aligned and appropriate.
9. Enable information discovery
• Freedom of Information Requests
• Data Protection Subject Access Requests
• Employee related legal requests - Tribunals
10. Conclusion
• Unmanaged Data Growth has become uncontrollable for many organisations
• Using a governance framework can effectively restore health reducing risk and increasing value of the information assets
• Find the balance between usability and security
11. Thank you & Questions
Editor's Notes
For many enterprises, managing the growing volumes of data is becoming a major challenge. Data centre storage volumes continue to grow in excess of 40% per annum. And this is only the tip of the iceberg: ever more enterprise data is now found not only outside the data centre, but outside the physical boundaries of the enterprise, on laptops, tablets and smartphones, as well as in the cloud. IDC's newest estimate says that in 2011 there were 1.8 zettabytes of digital data (created and replicated) in the world, growing to 7.9 zettabytes by 2015.
So where is all this data coming from? How are we creating, replicating, saving, mining and analysing such colossal amounts of data? Many sites publish statistics on Internet usage and digital data growth, with special consideration for social networking. Some statistics from 2010 and 2011 include:
• Twitter has 200 million tweets per day, or approximately 46MB/sec of data created (August 2011)
• Facebook has 640 million users, with 50% logging in daily (March 2011)
• LinkedIn has over 100 million users (mid-2011)
• The largest Yahoo! Hadoop cluster is 82PB, and over 40,000 servers are running its operations (June 2011)
• Facebook collects an average of 15TB of data every day, or 5000+ TB per year, and has more than 30PB in one cluster (March 2011)
• 107 trillion emails were sent in 2010
• There were 152 million blogs in 2010
• Google has more than 50 billion pages in its index (December 2011)
• YouTube has 3 billion views per day, and 48 hours of video is uploaded per minute (May 2011)
• Amazon's S3 cloud service had some 262 billion objects at the end of 2010, with approximately 200,000 requests per second
This data includes emails, word processing documents, multimedia, video, PDF files, spreadsheets, messaging content, digital pictures and graphics, mobile phone GPS records, and social media content.
With the proliferation of unstructured data both inside and outside the data centre, organisations have much less control over their information. They cannot be sure who has access to their information assets and they are struggling to meet their regulatory obligations. For many organisations, the increasing volumes of data have become a liability: it is costly to store and manage, and costly to search in the event of litigation.
The eight data protection principles in the Data Protection Act 1998 are:
1. Personal data shall be processed fairly and lawfully and, in particular, shall not be processed unless (a) at least one of the conditions in Schedule 2 is met, and (b) in the case of sensitive personal data, at least one of the conditions in Schedule 3 is also met.
2. Personal data shall be obtained only for one or more specified and lawful purposes, and shall not be further processed in any manner incompatible with that purpose or those purposes.
3. Personal data shall be adequate, relevant and not excessive in relation to the purpose or purposes for which they are processed.
4. Personal data shall be accurate and, where necessary, kept up to date.
5. Personal data processed for any purpose or purposes shall not be kept for longer than is necessary for that purpose or those purposes.
6. Personal data shall be processed in accordance with the rights of data subjects under this Act.
7. Appropriate technical and organisational measures shall be taken against unauthorised or unlawful processing of personal data and against accidental loss or destruction of, or damage to, personal data.
8. Personal data shall not be transferred to a country or territory outside the European Economic Area unless that country or territory ensures an adequate level of protection for the rights and freedoms of data subjects in relation to the processing of personal data.
In brief: what is an individual entitled to?
This right, commonly referred to as subject access, is created by section 7 of the Data Protection Act. It is most often used by individuals who want to see a copy of the information an organisation holds about them. However, the right of access goes further than this, and an individual who makes a written request and pays a fee is entitled to be:
• told whether any personal data is being processed;
• given a description of the personal data, the reasons it is being processed, and whether it will be given to any other organisations or people;
• given a copy of the information comprising the data; and
• given details of the source of the data (where this is available).
An individual can also request information about the reasoning behind any automated decisions, such as a computer-generated decision to grant or deny credit, or an assessment of performance at work (except where this information is a trade secret). Other rights relating to these types of decisions are dealt with in more detail in the section about rights relating to automatic decision taking. In most cases you must respond to a subject access request promptly and in any event within 40 calendar days of receiving it.
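The 40-calendar-day limit mentioned above lends itself to simple tracking. The following is a minimal sketch only: the function names are hypothetical, and it assumes the day of receipt counts as day zero, which the notes do not specify.

```python
from datetime import date, timedelta

# The 40 calendar days quoted in the notes above.
SAR_RESPONSE_DAYS = 40


def sar_deadline(received: date, limit_days: int = SAR_RESPONSE_DAYS) -> date:
    """Return the latest response date, treating the receipt date as day zero (assumption)."""
    return received + timedelta(days=limit_days)


def days_remaining(received: date, today: date) -> int:
    """Days left before the deadline; negative once the request is overdue."""
    return (sar_deadline(received) - today).days


if __name__ == "__main__":
    received = date(2012, 11, 1)
    print(sar_deadline(received), days_remaining(received, date(2012, 12, 1)))
```

A register of open requests could call `days_remaining` daily and escalate anything approaching zero.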
For many organisations, the greatest information governance fear is that critical information will leak out into hostile hands. The consequences of data loss can be catastrophic: Symantec and Ponemon Institute's report on the Cost of a Data Breach puts the average cost to an enterprise at £1.75m in the UK and $5.5m in the US.
Imagine you are running a bath and the water is your data. You left the plug out, so the water is leaking out. You could turn the tap off, but you will still lose data. To stop the leak you need to put the appropriate technologies in place, for example firewalls, intrusion prevention systems (IPS) and data loss prevention tools.
Organisations should implement controls that help prevent data loss from the perimeter. Typically, these controls target egress points such as email, web (including web email, IM and cloud services), portable USB devices and mobile devices. Content-based data loss prevention technology can be configured to detect and block common critical information types, such as PCI data, customer information and HR records. It can even help protect intellectual property, such as source code or design blueprints.
Example: University of York in student data breach on website
An investigation has begun at the University of York after personal data of 148 students was published. Information including the students' mobile phone numbers, addresses and A-level results was made available. The information could be accessed on a student inquiry page on the university's website. In a statement, the university said it had "taken immediate action to rectify this problem" and had apologised to all those students affected. Following the breach of data last week, the university said a review of its security systems was under way.
The statement, signed by Registrar Dr David Duncan, said: "We are also investigating all procedures and management systems and will undertake a thorough review of our data security arrangements. The Information Commissioner has been informed. I would like to apologise to everyone who has been affected by this breach."
Tim Ngwena, president of the university's student union, said: "Obviously students are quite concerned because you trust the information that you provide, when you apply to any institution, to be held safely much like anyone would expect when applying for any job."
If found to have violated the 1998 Data Protection Act, the university could face a fine or legal action.
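To illustrate the content-based data loss prevention described above, here is a minimal sketch of how an egress filter might spot payment card numbers (PCI data) in outgoing text. It is not how any particular DLP product works: the pattern, the function names and the blocking rule are all assumptions made for illustration. It combines a regular expression with the Luhn checksum to cut down false positives.

```python
import re

# Assumed pattern: 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return candidate card numbers in the text that pass the Luhn check."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


def should_block(message: str) -> bool:
    """Hypothetical egress rule: block any message containing a likely card number."""
    return bool(find_card_numbers(message))
```

A real deployment would add patterns for other information types (customer records, HR data) and would usually quarantine or alert rather than silently drop a message.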
One of the biggest challenges organisations face is determining just what information they have, how critical it is and where it is located. Gaining visibility of information assets is a major step in information governance programmes. Much of the work is people-intensive and involves meeting with the various business units and functional teams to determine the different types of information they handle and its criticality to the business. However, finding out exactly where the various information types are stored, particularly in unstructured data stores, can be very difficult.
There are three key questions to ask:
• What will make the hackers rich? This is information that could be sold on the black market, such as intellectual property or corporate data.
• What could damage the reputation of the organisation? Anything that gets into the press about losing customer data is likely to be bad news for an organisation and could result in customers going to competitors.
• What needs to be kept for regulatory compliance? Financial regulations and the Data Protection Act are examples of areas where data needs to be kept for compliance.
One effective way to begin is to scan the various unstructured data stores and analyse access patterns. By mapping files to the business units or functional teams that access them most frequently, you can determine the likely criticality of the data held within them. Content-scanning techniques can then be applied to identify specific information assets, focusing first on those data stores thought likely to hold the most critical data.
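The access-pattern approach described above can be sketched in a few lines. This is an illustration under stated assumptions: the event format (pairs of file path and business unit), the unit names and the criticality scores are all hypothetical, and a real programme would take them from file-server audit logs and business interviews.

```python
from collections import Counter, defaultdict

# Hypothetical criticality score per business unit (higher means scan sooner).
UNIT_CRITICALITY = {"HR": 3, "Finance": 3, "Marketing": 1}


def map_files_to_units(access_events):
    """Map each file to the business unit that accessed it most often.

    access_events is an iterable of (file_path, business_unit) pairs.
    """
    counts = defaultdict(Counter)
    for path, unit in access_events:
        counts[path][unit] += 1
    return {path: c.most_common(1)[0][0] for path, c in counts.items()}


def scan_order(access_events, criticality=UNIT_CRITICALITY):
    """Order files for content scanning, most critical owning unit first."""
    owners = map_files_to_units(access_events)
    return sorted(owners, key=lambda p: (-criticality.get(owners[p], 0), p))


if __name__ == "__main__":
    events = [
        ("/share/payroll.xlsx", "HR"),
        ("/share/payroll.xlsx", "HR"),
        ("/share/payroll.xlsx", "Marketing"),
        ("/share/flyer.pdf", "Marketing"),
    ]
    print(scan_order(events))
```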
These policies should include the rules for who has access to what information types, where the information is stored, how long it is retained and when it should be deleted. They should also cover protective marking. Policies should ensure compliance with applicable legal and industry regulations and protect from undue risk, while still enabling the information to be used effectively to drive business value. When the organisation stops storing irrelevant data, storage costs fall in turn.
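A policy of this kind can be expressed as data so that tools can enforce it. The sketch below is one possible shape, not a prescribed format: the class, field names, roles, locations and retention periods are illustrative assumptions and are not taken from the source or from any regulation.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionPolicy:
    info_type: str             # information type the policy covers
    allowed_roles: frozenset   # who may access it
    storage_location: str      # where it must be stored
    retention_days: int        # how long it is retained before deletion


# Illustrative values only.
POLICIES = {
    "hr_record": RetentionPolicy("hr_record", frozenset({"hr"}), "hr-archive", 6 * 365),
    "web_log": RetentionPolicy("web_log", frozenset({"it", "security"}), "log-store", 90),
}


def can_access(policy: RetentionPolicy, role: str) -> bool:
    """Return True if the role is permitted to access this information type."""
    return role in policy.allowed_roles


def due_for_deletion(policy: RetentionPolicy, created: date, today: date) -> bool:
    """Return True once the retention period has fully elapsed."""
    return today > created + timedelta(days=policy.retention_days)
```

Keeping the rules in one table like `POLICIES` means access checks and disposal jobs read from the same source.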
It is also vital to implement controls that will ensure policy compliance. Many of these controls will be process-based; others will be technical. Technical controls often include data de-duplication, compression, replication and backup, data encryption (at rest, in motion, or both), archiving, automated data disposal and content-based data loss prevention. One of the most important controls, though, is people: ensure they are educated, trained and aware of the risks associated with handling information. Most organisations take a risk-based approach to implementing these technical controls, prioritising their most critical or sensitive data types first and focusing initially on those controls that will deliver the biggest benefit to the business in terms of risk and/or cost reduction.
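As an example of one technical control named above, data de-duplication can be approximated by grouping files whose contents hash to the same value. This is a simplified file-level sketch with hypothetical function names; it takes file contents as in-memory bytes for brevity, whereas storage products typically de-duplicate at block level.

```python
import hashlib
from collections import defaultdict


def content_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(data).hexdigest()


def find_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names that share identical content.

    files maps a file name to its content. Only groups with more than
    one member are returned, each sorted by name.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        groups[content_digest(data)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

Each group returned is a candidate for keeping one copy and replacing the rest with references.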
The final step is to implement tools that allow data stores to be searched easily when information must be discovered in response to legal action. Many enterprises face the daunting task of identifying all the electronic data related to a case; this can be hugely labour-intensive, sometimes involving dozens of lawyers, IT professionals and business unit representatives. Effective eDiscovery tools can dramatically reduce the amount of manual effort involved. Based on experience, the time required to complete discovery exercises can be reduced by a factor of ten and the financial savings can run into millions of pounds.
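To show why indexed search reduces the manual effort described above, here is a minimal sketch of a keyword index over a set of documents. It is a toy illustration, not a description of any eDiscovery product: the function names and document identifiers are hypothetical, and real tools add metadata filters, legal holds and audit trails.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> set[str]:
    """Lower-case the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Build an inverted index mapping each token to the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index: dict[str, set[str]], query: str) -> list[str]:
    """Return the sorted ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)
```

Once the index is built, each request is answered by set lookups rather than by opening every document again.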
Unmanaged data growth has become endemic in many enterprises. However, the impact of failing to govern information effectively, in terms of increased risks, increased costs and an inability to exploit the business value of the enterprise's information assets, is increasing. This five-step approach to information governance provides an effective framework for gaining control over your enterprise data stores and will enable a reduction in your information risk, while at the same time gaining full value from your corporate information assets.