Identity vs Reputation


A CMSG white paper on user privacy

What You Will Learn

This paper covers individual identities on the WWW and how tracking users’ interactions can improve their experience without sacrificing their privacy.

Introduction

Web users have become increasingly savvy about protecting their identity and privacy. At the same time, web site operators have become savvy about amassing large amounts of customer data and finding trends to customize user experiences and offerings. To be successful at this, web site operators need to respect the privacy needs of their users while collecting the information they need to improve their business. Violation of users’ privacy can result in the loss of the customer as well as government intervention. This paper provides one approach to meeting these seemingly conflicting goals.

Definitions

Defining the terms around identity and privacy is of critical importance.

Online Identity: A person’s distinct individual online persona. It usually doesn’t include any Personally Identifiable Information (PII) but consists of a shell, including a nickname and an avatar.

Authentication: The process in which a person’s identity is confirmed online using a verifiable source to admit them into an online community or website.

Verifiable Source: A verifiable source may be as simple as providing an email address, or may be as significant as providing a credit card number.

Authorization: The process by which a person becomes approved to enter a website or program, usually with a user name and/or password.

Personally Identifiable Information (PII): A term used in privacy and legal fields that refers to any information that can identify a person as a specific individual, such as name, postal or email address, phone number, occupation, or personal interests. It does not include web pages viewed or links clicked on, web search terms, time spent on a site, response to advertisements, or system settings such as the browser used, speed of connection and zip code.

Sensitive Personal Information (SPI): Any information that would permit access to a person’s financial account, including account number, credit or debit card number, in combination with any required security code, access code or password.

Why Privacy is Important / Why it Matters

The rise of web-based applications has made it easy for companies to determine information about their customers, ranging from their basic demographics to their personal preferences. While this information can be gathered explicitly through surveys and forms, it can also be inferred through the user’s actions. The results may benefit the user in the end, but the method may make them uncomfortable and cause them to leave the website. The ultimate challenge is balancing the needs of both parties. At the end of the day, though, privacy is measured by the end consumer’s reaction to their experience.

What The User Reveals

In order to meet the requirements of the Children’s Online Privacy Protection Act (COPPA) of 1998, users may be asked to enter their birthday to verify their eligibility to access certain content. Users are comfortable revealing this and other basic demographics in order to access many of their favorite sites. They do, however, make a conscious decision to limit what they reveal on a site to what they feel is necessary for the experience. When prompted for information that a user feels is unnecessary, they will typically provide incorrect information about such things as their birthday or gender.

At the same time, when it comes to social networking sites such as LinkedIn and Facebook, there is a social norm which causes people to reveal much more accurate information. When there are personal relationships involved, people feel compelled to provide their real birth date or gender information. When pictures can be uploaded, the accuracy of the basic information increases even more since deceptions are more likely to be uncovered.

Beyond the basic information, the accuracy of what users reveal about themselves is much more impacted by social status and peer pressure than anything else. Stereotypes can be readily found in individual profiles: for example, men expressing interest in action movies and sports, college students talking about parties, and women liking romantic movies.

Another area that causes concern for individuals is what they reveal from a financial perspective. As a result, they often provide false information. Beyond the basic PII, users may misrepresent their financial status to boost their self-esteem or to assert themselves as a member of a particular group. Ironically, this is one piece of information that companies are most interested in, to ensure that they target the right product to the right user.

The most useful information is what a user does when online. Some of the obvious examples are purchasing choices that are indicative of gender, such as a purse or a wallet. More subtle ones come from participation in groups that have an obvious bias, such as a retirees’ discussion group, or a visit to an ecological travel site. These actions, when combined with photos that a person may have on his/her Flickr account or messages posted to online discussion groups, can provide a more complete understanding of an individual. What is significant here is that the information doesn’t have to come from the user directly. The ability of Facebook users to tag a photo with all the people in it means that this information can be made available without the user taking any action.

Historical Mistakes

The stakes are huge for companies to get identity and privacy right. Over the past few years, a number of high-profile incidents where PII or SPI was accidentally revealed to the public have been broadly publicized.

In 2006, America Online (AOL) released records of 20 million search queries made by approximately 650,000 of its users over a three-month period. While the users were not personally identified, per se, their searches contained a wealth of PII. Within only a few days, New York Times journalists had determined the identities of many of the searchers and, with permission, revealed the identity of one of the users.[1] That user, a 62-year-old Georgia woman, had conducted over 300 searches that were traced back to her, some of which were embarrassing to her. The AOL incident was devastating to the company.

Similarly in 2006, Netflix released over 100 million movie ratings made by 500,000 of the company’s subscribers. To protect its customers’ privacy, the data was made anonymous by removing any personal details. Only a few weeks later, Arvind Narayanan and Vitaly Shmatikov announced that they had de-anonymized the data by comparing it with publicly available ratings on the Internet Movie Database.[2]

Most recently, Facebook faced an uproar of criticism over its Beacon advertising program, which pulls information from external websites and shares that information with Facebook users’ friends. Controversy swiftly followed Beacon’s launch over privacy concerns because the mechanism to opt in or out of the program was not clear. Fortunately for Facebook, the concern over Beacon did not doom the program. In fact, it continues to operate today, but with a higher level of control given to end users to permit the sharing of their information.

Customer Benefits

While data gathering primarily provides feedback to advertisers and content providers about trends and product interest, it also provides a significant benefit to all users. When users express similar interests, content providers can respond by creating new products or modifying old products to meet the newly discovered interests.

This is most evident in the local grocery store, which pays extremely close attention to the aggregate buying habits of its customers in order to ensure that the right products are always on the shelves. No company wants to create a product for a single user or even track the habits of one person. They are looking for a solution that will maximize profits through the largest audience possible.

At the same time, users expect more of a personalized, intuitive experience. It is when the user has a perception of value for what they reveal that they will really see an improvement in their experience. Users are willing to let Amazon track their purchasing habits because they get better recommendations as a result. They provide accurate ratings to Netflix in order to improve the quality of the movies that it suggests to them. The key to all of this is making it obvious to the user that they are the ones benefiting.

With this in mind, it is important for web site operators to remember that the personal data belongs to the end user. If the user perceives sufficient value for providing the information, they will readily reveal it. When users are forced to reveal information that they are not ready to share, they will either provide inaccurate information or choose to go elsewhere. In either case the loss falls on the web site, which either pollutes its trend data or loses the customer. A better approach would be to allow the user to clearly retain their privacy, stay with the site, and opt for lesser quality recommendations.

Eventually, when the user hears about or otherwise realizes the value of the sharing, they will gladly provide accurate information. In return, these users expect that information to be kept private. It is when this trust is broken that users will react. When this reaction becomes an uproar, the government gets involved and creates new laws to ensure that privacy is protected.

[Figure: Customer Driven Quality Improvement Process. Cycle diagram; recoverable labels include Define Goals, Best Practices, Metrics, Monitoring, and Results.]

Privacy Laws and Reactions

Privacy laws in the United States and across the globe are inconsistent and continue to evolve. In contrast to the European Union, in the United States there is no over-arching privacy law in place. Instead, the United States takes a more laissez-faire approach that targets specific sectors, relying on a combination of legislation, regulation, and self-regulation. For example, U.S. laws are in place to address medical privacy, financial institution privacy and children’s privacy.

The EU has a comprehensive law[4] reflecting the EU’s philosophy that while data processing is beneficial, an individual’s fundamental privacy rights must be protected. Many consider the EU to have the most restrictive privacy laws of any jurisdiction worldwide. Importantly, the EU regulations are implemented by each individual member state, which has led to different interpretations and governing regulations.

While privacy has historically been given low priority in Asia, economic concerns (in particular, the desire to establish consumer trust in online commerce) have driven a surge in privacy there (see “Asia: the new Thought Leader in Privacy?”). The Asia Pacific Economic Cooperation group (APEC) approved a set of non-binding privacy principles in 2004 to assist governments in passing comprehensive privacy legislation. In contrast, Central American nations tend to take a sectoral approach to privacy laws.[5][6]

Privacy laws and regulations continue to react to the marketplace, with new technologies and processes leading to more stringent regulation. For instance, the recent emergence of behavioral targeting has raised the ire of privacy regulators. Service providers, along with two companies, Phorm in the UK and NebuAd in the US, have recently found themselves embroiled in controversy over plans to target customers with advertisements based on their prior web surfing behaviors.[7][8] Both companies planned to install deep packet inspection equipment on ISP networks that would monitor subscribers’ online activities, build behavioral profiles, and sell the profiles to advertisers, who could use them to deliver targeted ads.

Privacy regulators in the EU and United States questioned whether the companies obtained informed consent from end users. BT deployed its system without the knowledge of affected users. European Union Communications Commissioner Viviane Reding voiced her concern that the practice breached the EU Privacy and Electronic Communications Regulations 2003 (PECR), which implement European Directives on wiretapping, saying “[i]t is very clear in E.U. directives that unless someone specifically gives authorization (to track consumer activity on the Web) then you don’t have the right to do that.”

Charter Communications notified affected customers, who could opt out of the program. However, public interest groups claimed the opt-out system did not prevent users’ activities from being monitored. Two members of the United States House of Representatives wrote a letter to Charter expressing their concern that “[a]ny service to which a subscriber does not affirmatively subscribe and that can result in the collection of information about the web-related habits and interests of a subscriber, and achieves any of these results with the ‘prior written or electronic consent of the subscriber,’ raises substantial questions related to Section 631 [of the Communications Act].”

Behavioral targeting has advanced over the years to provide a much more complete view of users’ behaviors. While the behavioral targeting industry has attempted to educate consumers on the benefits of having content tailored to individuals, there are still many concerns over transparency, the ability to easily opt out, and how opt-out data is discarded. As a result, regulators and lawmakers have proposed legislation and regulation to address the privacy concerns around behavioral targeting.

Stratification

It is possible for the industry to meet all of these regulations, the desires of a company, and the needs of the user by taking a layered approach to the information about a user and the collective actions of a community. To accomplish this, the concept of who a user is can be broken down into three levels.

1. Identity – Used for authentication and authorization
2. Profile (or Persona) – Used to describe an individual
3. Uniquity – Unique identifier used to collect actions

Identity

At the highest level is a user’s identity. This is how users establish that they are who they say they are. It is often represented as a combination of a userid and a password, but can also be authenticated through identification cards, biometrics, certificates, encryption keys, or other security mechanisms. To a user, this is the most valuable thing that they have, because if someone else gets it from them, the user stands to lose a lot. Given their value, these authentication credentials are a common target of phishing attacks.

This identity is often shared among multiple web sites, particularly when a site’s default identity is an email address, to which the site sends a verification message that requires no third-party involvement. The advent of OpenID technologies allows a user to use a common identity to access multiple sites. One weakness in using an email address/password combination to authenticate a user is that any compromised site may lead to a user’s identity being compromised on multiple sites.

To a web site operator, the identity portion has very little value other than to authenticate the user. However, protecting it requires attention to the security of not just the data but the actual mechanisms of authentication. This is necessary to give confidence to the end user that their personal information can’t be compromised.
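
The three levels listed above can be sketched as separate records, written here in Python purely for illustration. The paper does not prescribe an implementation: the keyed hash (HMAC-SHA256) used to link one level to the next, the `SITE_SECRET` value, and every class and function name below are assumptions made for this sketch.

```python
import hashlib
import hmac
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical server-side secret (an assumption of this sketch). Someone who
# holds only a derived id, and not this secret, cannot recompute the link.
SITE_SECRET = b"example-secret-not-for-production"


def derive(label: str, source_id: str) -> str:
    """Derive a lower-level id from a higher-level one with HMAC-SHA256."""
    message = f"{label}:{source_id}".encode("utf-8")
    return hmac.new(SITE_SECRET, message, hashlib.sha256).hexdigest()


@dataclass
class Identity:
    """Level 1: used only for authentication and authorization."""
    user_id: str


@dataclass
class Profile:
    """Level 2: describes an individual (nickname, interests, and so on)."""
    profile_id: str
    nickname: str
    interests: List[str] = field(default_factory=list)


@dataclass
class ActionLog:
    """Level 3: actions collected per uniquity; holds no PII fields."""
    actions: Dict[str, List[str]] = field(default_factory=dict)

    def record(self, uniquity: str, action: str) -> None:
        self.actions.setdefault(uniquity, []).append(action)

    def trends(self) -> Counter:
        """Count actions across all uniquity ids (community-level view)."""
        return Counter(a for acts in self.actions.values() for a in acts)


def profile_id_for(identity: Identity) -> str:
    return derive("profile", identity.user_id)


def uniquity_for(profile_id: str) -> str:
    return derive("uniquity", profile_id)


# Usage: two users act on the site; only uniquity ids reach the action log.
alice = Profile(profile_id_for(Identity("alice@example.com")), "al")
bob = Profile(profile_id_for(Identity("bob@example.com")), "bobby")

log = ActionLog()
log.record(uniquity_for(alice.profile_id), "viewed:eco-travel")
log.record(uniquity_for(bob.profile_id), "viewed:eco-travel")
log.record(uniquity_for(bob.profile_id), "bought:wallet")

print(log.trends().most_common(1))  # [('viewed:eco-travel', 2)]
```

The design choice worth noting is that the action log is keyed only by the derived identifier, so aggregate counts can be computed without touching the identity or profile records. A keyed hash is only one possible way to get a link that runs in a single direction; a real deployment would also need to manage the secret and consider whether the actions themselves leak identifying detail.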

Profile

Given the identity, the user has access to their profile(s), where all of their PII and information about their friends, interests, groups and preferences are stored. This information is still valuable to the user: if it is compromised, someone can impersonate the user, with the degree of risk depending on how much SPI is taken.

It is important to note that the relationship between an Identity and a Profile is one way. Given an Identity, it is possible to determine a profile, but starting from a profile does not yield the credentials that the user gave to create it. This one-way relationship also means that a user may have multiple profiles based on the situation that they are in. For example, the user may choose to have a different public name or picture on a team’s fan site versus when they are on a cooking-related site. The amount of information that they reveal in their profile can vary from site to site based on the user’s perceived value from the site. Furthermore, as a user creates multiple profiles for the different sites that they want to participate in, it is incumbent on the user to keep them in sync.

Uniquity

At the lowest level we propose the concept of uniquity, which represents a collection of a user’s actions. It is important that it does not contain any PII. What it does contain is a collection of actions that an anonymous user has taken. Like the relationship between the Identity and the Profile, you cannot get back to the profile from uniquity. To a user, this has the least value. If someone gets a copy of the uniquity, the best that can be done is to imitate a random user.

Implementing this requires attention to detail in ensuring that all of the PII is completely separated out from the actions and that the same one-way relationship is established. It also means that algorithms should operate on the clearly observed behaviors instead of the public face that the user has put up.

Large-scale trend analysis of uniquity reveals interests at large. A web site operator or a content producer can get a clear understanding of the interests of the community without breaching the privacy of those individuals. These trends even allow them to take into consideration those users who chose to remain anonymous.

Analyzing uniquity, which ignores identity and profiles, provides benefits for the publisher while protecting users’ privacy and giving users the clear benefit of recommendations targeted at them. If they choose to remain anonymous, the quality of these recommendations is limited to the current session that they are in and the minimal information that they have chosen to provide. For the web site operator, beyond the benefits of complying with all government regulations, it also makes it easier to provide a custom experience that users come back to.

Conclusion

By carefully separating out the levels of information that are stored for a user, it is possible to meet even the strictest of government regulations while offering a clear value to the end user.

Americas Headquarters: Cisco Systems, Inc., San Jose, CA
Asia Pacific Headquarters: Cisco Systems (USA) Pte. Ltd., Singapore
Europe Headquarters: Cisco Systems International BV, Amsterdam, The Netherlands

Cisco has more than 200 offices worldwide.

© 2008 Cisco Systems, Inc. All rights reserved.