This document provides an overview of online identity and privacy. It discusses what constitutes online identity across different platforms and how online identities are tracked through user data collection. Specific examples are given around how easily user data can be de-anonymized and linked back to individuals even when anonymized. The document then explores several alternative network architectures like Tor, I2P and Freenet that provide increased privacy and anonymity compared to the traditional internet. It argues these alternatives show an internet with different structures is possible.
2. INTRODUCTION
• PART I: What is Online Identity?
• PART II: The Persistence of Online Identity
• PART III: Theories of the Present and the Future
• PART IV: The Internet is not Immutable
1/4
3. INTRODUCTION
• The internet is an evolving experiment in ontology
• Putting reformists, revolutionaries, crypto-anarchists,
techies, and ‘normies’ (excuse the language) in one room
• Scare you
2/4
4. INTRODUCTION
• NOT a technical discussion or a how-to
• NOT a chronology (cause-effect fallacy)
• NOT a lecture (speak up, shout me down, talk amongst
yourselves)
• NOT a pro- or anti-internet talk (but those elements will be
there)
3/4
5. INTRODUCTION
• UC Berkeley student during anti-cuts ‘09
• Freelance coder
• I’m building an app/platform for local exploration
• Some of the things I say might be bad for my career
4/4
6. HOUSE OF MEDICI
The House of Medici is founded as a vehicle for
artist-developers. An agile design, development,
and investment firm, the house is dedicated to
commissioning, developing and facilitating
people whose designs and organizations
spearhead the movement towards creative,
prosperous, localized and vibrant cities.
“The creation continues incessantly through the media of man”
- Antonio Gaudi
7. PART I
What is Online Identity?
(if the medium is the message, what is the internet saying?)
8. SO WHAT ARE YOU,
ONLINE?
• Several basic types of identity and interaction
• Messaging (static)
• Usenet/BBS/IRC/Forums (fluid)
• Social Networking (fixed)
• Content hubs (contextual)
• App-based interactions (proprietary)
• Putting all these together
• A highly modular communications medium
• Near infinite access and storage of information
• A common interface of metaphors
• Fundamentally imbued with the presence of capital and the state
1/5
10. MONEY AND THE
INTERNET
• Cables (continental, trans-Atlantic & trans-Pacific, specialty)
• Routing
• Switching
• Hardware
• Software
• Hosting
• Serving
• Accessing
• Spectrum ownership
• These are all ‘first-world’ costs
3/5
11. MONEY AND THE
INTERNET
• Military (efficiency)
• Pay-to-access (first party)
• Online transactions (second-party)
• Ads (third party)
• Drum roll… data (all-party)
4/5
12. ADDITIONAL QUESTIONS
• Is the internet centralized or decentralized, and
where/how?
• What is the nature of a ‘free’ service/platform?
• Who can see what I do online, and to what
extent/form/etc.?
• Should data be considered property, speech, or
both?
5/5
14. TRACKING
• Raw communication
• Facts (wiki-fication of the internet)
• Opinion (the blogosphere)
• Self
• Real-time
• Analytics (new to the team)
1/4
15. TRACKING
• The service, usually in exchange for being ‘free’ (but not always)
• Advertisers
• Social services, aggregators, and viral marketers
• Service providers
• The government
• Each other
2/4
16. TRACKING
• Using explicitly volunteered information (polls, ratings,
commentary)
• Reading the content we create or upload
• Tracking our use patterns (browsing, interacting, etc.) through
IDs and meta-data
• Building a social graph
• Testing machine-learning algorithms on us
3/4
17. TRACKING
• The paradox of consent
• Hannah Arendt’s natality, plurality, and visibility
• The internet as the new commons
• A kink in the system: the No-Network problem
4/4
18. DATA 1/5
• The result of tracking is data. So really we should be focusing
on that.
• Data collection is a legal, ethical, and technological ‘No Man’s
Land’
• Data collection is currently the driving force behind market
growth in internet technologies (‘big data’)
19. DATA 2/5
• In 2006, Netflix announced a $1 million prize for a better
movie recommendation algorithm. To help with the effort, they
released a dataset of roughly 100 million ratings by about half a
million subscribers.
• This data was ‘anonymized’
Case Study: Netflix Prize
20. DATA 3/5
• Researchers Arvind Narayanan and Vitaly Shmatikov were able to
‘de-anonymize’ the data, using pure math.
• By knowing as few as 2 of your movie ratings and the dates you rated
them (to within about 3 days), they could uniquely identify about 68% of
the subscribers in the dataset.
• By knowing 8 of your movie ratings, of which 2 could be wrong, and the
dates you rated them (to within about 2 weeks), they could uniquely
identify 99% of the subscribers in the dataset.
• By cross-referencing ratings from a public service (IMDb), they could
put a name to an individual record in the dataset
• The rarer a movie, the fewer other movies they needed to know
Case Study: Netflix Prize
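The matching idea in this case study can be sketched in a few lines. This is a toy illustration, not the researchers' actual algorithm: the records, titles, dates, the `1 / log(1 + raters)` rarity weight, and the `min_gap` threshold are all assumptions made up for the example. It shows how a couple of approximately known ratings can single out one record, and why rare titles carry more weight.

```python
import math
from datetime import date

# Toy "anonymized" records: record id -> {movie: (stars, date rated)}.
# Every title, rating, and date here is invented for illustration.
RECORDS = {
    "r1": {"Blockbuster": (4, date(2005, 1, 10)),
           "Obscure Documentary": (5, date(2005, 2, 3)),
           "Indie Drama": (2, date(2005, 2, 20))},
    "r2": {"Blockbuster": (4, date(2005, 1, 12)),
           "Romantic Comedy": (3, date(2005, 3, 1))},
    "r3": {"Blockbuster": (5, date(2005, 4, 2)),
           "Romantic Comedy": (3, date(2005, 3, 5)),
           "Indie Drama": (1, date(2005, 5, 5))},
}

def rarity_weights(records):
    """Weight each movie by 1 / log(1 + number of raters): rare titles count more."""
    counts = {}
    for ratings in records.values():
        for movie in ratings:
            counts[movie] = counts.get(movie, 0) + 1
    return {movie: 1.0 / math.log(1 + n) for movie, n in counts.items()}

def score(record, aux, weights, max_days):
    """Sum the weights of the auxiliary observations this record agrees with."""
    total = 0.0
    for movie, (stars, when) in aux.items():
        if movie in record:
            rec_stars, rec_when = record[movie]
            if rec_stars == stars and abs((rec_when - when).days) <= max_days:
                total += weights[movie]
    return total

def deanonymize(records, aux, max_days=14, min_gap=0.5):
    """Return the id of the clearly best-matching record, or None if ambiguous."""
    weights = rarity_weights(records)
    ranked = sorted(
        ((score(rec, aux, weights, max_days), rid) for rid, rec in records.items()),
        reverse=True,
    )
    best_score, best_id = ranked[0]
    runner_up = ranked[1][0] if len(ranked) > 1 else 0.0
    return best_id if best_score - runner_up >= min_gap else None

# "Water cooler" knowledge: two ratings with approximate dates.
known = {"Blockbuster": (4, date(2005, 1, 11)),
         "Obscure Documentary": (5, date(2005, 2, 1))}
match = deanonymize(RECORDS, known)
```

Knowing only the popular title leaves two candidate records tied, so the function declines to answer; adding the rare title breaks the tie.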
21. DATA 4/5
• With ‘water cooler’ knowledge of your movie interests, they
could pick your data out of a set of millions of ratings/people.
• Our movie ratings often coincide with our political, social, and
religious beliefs. What could someone infer from
knowing the 100+ movies that I like?
• Cross-referencing datasets is immensely powerful
Case Study: Netflix Prize
22. DATA 5/5
• Data is a treasure trove inside a black box, and many are staking
their future on this data telling an accurate story
• This is a race to the bottom: the better they get at knowing us,
the less these companies will have to aggregate massive amounts
of data.
• It’s also a race to the top: the better they get at knowing us, the
fewer services we’ll ‘need’.
• Technology has exponential growth
• But, the data itself is closely guarded. A few specific companies
are in the position to ‘know’ the most things.
23. A DIFFERENT WAY?
• Connecting the dots between the ubiquity and ease of data
collection, the knowledge that this data provides, and the lack
of accountability paints a bleak picture
• Most people, if not all, accept this as the future, including
facial recognition, identity tracking, hyper-targeted advertising,
and inference of beliefs.
• In Part IV, we can discuss some very potent alternative
structures that dismantle these tools.
27. THE FRIENDSHIP
PARADOX
2/3
• Observed on almost every social network, mathematically
derived, and empirically proven
• Also applies to: publishing papers, sexual partners
• Used in epidemiology for better subject choice
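A small numerical check, assuming the slide refers to the standard statement of the friendship paradox (on average, your friends have more friends than you do). The graph below is made up for illustration; the function compares the average number of friends per person with the average number of friends that a friend has.

```python
def friendship_paradox(edges):
    """Return (mean degree, mean degree of a friend) for an undirected graph."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    degree = {person: len(friends) for person, friends in neighbors.items()}

    # Average number of friends per person.
    mean_degree = sum(degree.values()) / len(degree)

    # Average number of friends of a friend, taken over every
    # (person, friend) pair, so popular people are counted more often.
    friend_degrees = [degree[f] for friends in neighbors.values() for f in friends]
    mean_friend_degree = sum(friend_degrees) / len(friend_degrees)
    return mean_degree, mean_friend_degree

# Hypothetical network: one well-connected "hub" and four others.
EDGES = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("a", "b")]
mean_degree, mean_friend_degree = friendship_paradox(EDGES)
print(mean_degree, mean_friend_degree)
```

The gap comes from sampling: the hub appears in four people's friend lists, so it is counted four times when averaging over friends but only once when averaging over people.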
30. THE RULES STILL APPLY
HERE
1/2
• Faucets and Sinks
• Real Money Auction House (RMAH)
• May 2012: $300/million
• Feb 2013: $0.20/million, March 2013: $0.05/million
(hyperinflation is often defined as prices rising by more than 50% in a month)
• May 2013: $0.004/million (roughly 1/75,000th in 1 year)
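The arithmetic behind these figures, as a quick sketch. It uses only the prices listed on the slide and assumes they are US dollars per million units of the in-game currency sold on the RMAH; the constant monthly rate is a simplifying assumption, since the real decline was uneven.

```python
# Prices in US dollars per million units of in-game currency, from the slide.
price_may_2012 = 300.0
price_feb_2013 = 0.20
price_mar_2013 = 0.05
price_may_2013 = 0.004

# Overall: how many times less the currency is worth after 12 months.
fold_decline = price_may_2012 / price_may_2013

# Average fraction of value lost each month, assuming a constant monthly rate.
monthly_kept = (price_may_2013 / price_may_2012) ** (1 / 12)
monthly_loss = 1 - monthly_kept

# Fraction of value lost in the single month from February to March 2013.
feb_to_mar_loss = 1 - price_mar_2013 / price_feb_2013

print(f"worth 1/{fold_decline:,.0f} of its starting value after a year")
print(f"average loss per month: {monthly_loss:.1%}")
print(f"loss from Feb to Mar 2013: {feb_to_mar_loss:.0%}")
```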
33. THE FUTURE 1
Complete autonomy of capital : a mechanistic utopia where
human beings become simple accessories of an automated system,
though still retaining an executive role;
Mutation of the human being, or rather a change of the
species : production of a perfectly programmable being which has
lost all the characteristics of the species Homo sapiens. This would
not require an automatized system, since this perfect human being
would be made to do whatever is required;
Generalized lunacy : in the place of human beings, and on the
basis of their present limitations, capital realizes everything they
desire (normal or abnormal), but human beings cannot find
themselves and enjoyment continually lies in the future. The human
being is carried off in the run-away of capital, and keeps it going.
–Jacques Camatte, The Wandering of Humanity (1973)
34. THE FUTURE 1
Collapse and Post-Collapse
• Catastrophe
• My opinion is that some kind of internet will eventually be re-created in
a post-catastrophe world
• Revolution
• The current model of the internet still centralizes power, so a
revolutionary ‘decentralized’ or ‘worker-owned’ internet must be tangibly
different (more on that)
• Voluntary
• Primitivism
• But not necessarily
35. THE FUTURE 1
Post-Internet
• Planes of Immanence (Deleuze & Guattari)
• Merger of the self with the other
• Comfort vs. Actualization
• To record or not to record?
36. THE FUTURE 1
Neo-Internet
• Internet of Things
• Metaphor of the ‘smart’ wine-rack
• Camatte’s ‘managed’ utopia
• Virtual Reality / Augmented Reality
• Vacuous pleasure vs. Jouissance
• Camatte’s ‘generalized lunacy’
37. THE FUTURE 1
Distant Future
• Solar system colonization
• Ironically, the internet will hold us back
• Post-humanism vs Trans-humanism
• Camatte’s ‘species mutation’
38. PART IV
The Internet is not Immutable
(i.e. alternative ‘net’s)
With help from: http://www.irongeek.com/i.php?page=videos/intro-to-tor-i2p-darknets
39. ALTERNATIVE ‘NET’S
• Encryption: An alternative language
• Gestalt Theory: An alternative authority
• Mesh Networks: An alternative structure
• Tor: An alternative path
• I2P: An alternative channel
• Freenet: An alternative paradigm
42. END-TO-END
ENCRYPTION (E2EE)
3/3
• Companies will say: it hinders the user experience
• The NSA keeps your PGP traffic in case they ever obtain your
private key
• The math behind cryptography is pretty complicated and it has
been subverted
• Certain countries force you to decrypt your stuff!
• As in the Netflix case study, anonymizing your data is hard, so it
doesn’t matter if it’s encrypted
43. BIG DATA 1/3
• Can we design a database that:
• Collects tons of data about you (active)
• Can be used for actionable purposes like recommendation (valuable)
• But upon inspection, contains no useful information (non-authoritative)
• And when analyzed, can’t be de-anonymized (non-inferencing)
• Gestalt Theory
44. BIG DATA 2/3
• The database only contains restaurants in a single form: (X, Y).
• X: The first letter of the name of the restaurant
• Y: The first letter of the name of the street where it is located.
• The database contains only groups (‘sets’) of restaurants
• I ‘query’ the database by giving it a ‘set’ of restaurants I like, and
the restaurant I am asking about.
• When the database receives a query, it stores the ‘set’, and
successfully gives me my recommendation.
Thought Experiment: Restaurant Recommendations
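The thought experiment can be sketched as a tiny data structure. The slide does not say how the recommendation is computed, so the scoring rule here (total overlap with stored sets that contain the restaurant being asked about) is an assumption, as are the class name `SetDatabase` and all restaurant and street names.

```python
def encode(name, street):
    """Reduce a restaurant to (X, Y): first letter of its name and of its street."""
    return (name[0].upper(), street[0].upper())

class SetDatabase:
    """Stores only anonymous sets of (X, Y) codes; no user ids, no full names."""

    def __init__(self):
        self.sets = []

    def query(self, liked, candidate):
        """Score a candidate against past sets, then store the new set.

        Assumed scoring rule: for every stored set that contains the
        candidate, add the number of codes it shares with the querier's set.
        """
        liked = frozenset(liked)
        score = sum(len(liked & s) for s in self.sets if candidate in s)
        self.sets.append(liked)
        return score

# Hypothetical restaurants, used only to exercise the sketch.
luna = encode("Luna Cafe", "Main St")      # ('L', 'M')
taco = encode("Taco Stop", "Oak Ave")      # ('T', 'O')
pho = encode("Pho House", "Elm St")        # ('P', 'E')
burger = encode("Burger Barn", "Pine Rd")  # ('B', 'P')

db = SetDatabase()
first = db.query({luna, taco, pho}, burger)  # empty database, nothing to go on
second = db.query({luna, pho}, taco)         # overlaps with the first stored set
third = db.query({burger}, taco)             # no overlap with any set holding taco
```

Because many restaurants share the same pair of letters, a stored set does not identify a specific restaurant, yet overlapping sets still carry a usable signal.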
45. BIG DATA 3/3
Thought Experiment: Restaurant Recommendations
• Pattern Recognition is counter-intuitive
• This scheme is NOT a cipher
• Doesn’t implicate any individual, nor any restaurant
• This database can be completely public
• This example is highly simplified, but conceptually powerful.
• This technique actually makes computation faster in some
situations
47. MESH-NET 2/3
• Locality vs centrality
• Surveillance and censorship protection
• Community owns the tools of production
48. MESH-NET 3/3
• Malicious nodes
• A mobile-phone mesh-net is harder (not impossible)
• We still use a few key services (Facebook, Google Maps) that
could be (and have been) tracked
• Big-money tech companies like mesh-networking!
50. TOR-NET 2/3
• Security through “onion routing”
• Internet service providers (Comcast, etc.) can’t see what you’re
doing, only that you’re connected to Tor
• Websites can’t see where you’re from
• Certain services can be “inside” the onion, which means
they’re nearly un-blockable
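The layering idea behind onion routing can be sketched as follows. This is a toy model, not Tor's protocol: the relay names, the keys, the JSON layer format, and the hash-based XOR cipher are all assumptions for illustration, and the cipher is deliberately simplistic and NOT secure. The point is that each relay can remove only its own layer and learns only the next hop.

```python
import hashlib
import json

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher (NOT secure): XOR data with SHA-256(key || offset) blocks."""
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)

def wrap(message: str, destination: str, route):
    """Wrap a message in one layer per relay; route runs from entry to exit."""
    next_hop, payload = destination, message.encode()
    for name, key in reversed(route):
        layer = json.dumps({"next": next_hop, "data": payload.hex()}).encode()
        payload = keystream_xor(key, layer)
        next_hop = name
    return next_hop, payload  # the first relay and the fully wrapped onion

def peel(key: bytes, payload: bytes):
    """Remove one layer: a relay learns only the next hop and an opaque blob."""
    layer = json.loads(keystream_xor(key, payload))
    return layer["next"], bytes.fromhex(layer["data"])

# Hypothetical three-relay circuit with made-up keys.
ROUTE = [("entry", b"key-entry"), ("middle", b"key-middle"), ("exit", b"key-exit")]
KEYS = dict(ROUTE)

first_hop, onion = wrap("hello", "example.org", ROUTE)
hop, payload, trace = first_hop, onion, []
while hop in KEYS:
    trace.append(hop)
    hop, payload = peel(KEYS[hop], payload)
```

In this sketch the entry relay sees the sender but only an encrypted blob addressed to the middle relay, while the exit relay sees the destination but not who sent the message.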
51. TOR-NET 3/3
• Only a few nodes (on the order of 1000s) = bottleneck
• There is still a trust dependency on Tor nodes, e.g. correlation
attacks
• Can’t use certain services, such as peer-to-peer file-sharing
• Again… metadata. Still building on top of the regular internet
53. INVISIBLE INTERNET
PROJECT (I2P)
2/3
• Security through “garlic routing” (great metaphor, eh?)
• Somewhat of a hybrid of Mesh and Tor
• No central infrastructure! How awesome
• Can create secure, encrypted channels (friend-to-friend)
• Actually better than Tor for hidden services because it was
designed for them, but also because it is self-organizing
• Can do peer-to-peer file sharing! Also, anonymous e-mail,
anonymous chat, etc.
54. INVISIBLE INTERNET
PROJECT (I2P)
3/3
• It’s slow
• Not designed for the greater internet (and less secure for that
stuff)
• Hasn’t been around as long as other services (so fewer eyes on
the code and fewer papers published).
• In other words, your-mileage-may-vary
55. What if I wanted to do away with the
entire paradigm of the internet?
57. FREENET 1/3
[Figure: network diagram with nodes labeled SF, Oakland, Europe, and Africa]
• Each piece of data is a
unique number (‘hash’)
• Every node contains
some data, and the
location of the data
with the closest hashes.
• We ‘find’ data by
hopping from node to
node asking for our
number
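The hop-by-hop search described on this slide can be sketched as greedy routing toward the closest number. This is a simplification under stated assumptions, not Freenet's real routing: the 16-bit key space, the node locations, the chain topology, and the helper names are invented, and the encryption of stored data is left out. The node names are borrowed from the slide's diagram.

```python
import hashlib

def key_for(content: bytes) -> int:
    """Derive a small numeric key from the content's hash (16 bits, for readability)."""
    return int.from_bytes(hashlib.sha256(content).digest()[:2], "big")

class Node:
    def __init__(self, name, location):
        self.name = name
        self.location = location  # where this node sits in the key space
        self.store = {}           # key -> data held by this node
        self.neighbors = []

def link(a, b):
    a.neighbors.append(b)
    b.neighbors.append(a)

def insert(nodes, content):
    """Place data on the node whose location is closest to the data's key."""
    key = key_for(content)
    holder = min(nodes, key=lambda n: abs(n.location - key))
    holder.store[key] = content
    return key

def lookup(start, key, max_hops=10):
    """Hop to whichever unvisited neighbor is closest to the key until found."""
    node, path = start, [start.name]
    while True:
        if key in node.store:
            return node.store[key], path
        unvisited = [n for n in node.neighbors if n.name not in path]
        if not unvisited or len(path) > max_hops:
            return None, path
        node = min(unvisited, key=lambda n: abs(n.location - key))
        path.append(node.name)

# Four nodes in a chain, with assumed locations spread over the key space.
sf, oakland = Node("SF", 0), Node("Oakland", 20000)
europe, africa = Node("Europe", 40000), Node("Africa", 60000)
link(sf, oakland)
link(oakland, europe)
link(europe, africa)

key = insert([sf, oakland, europe, africa], b"a pamphlet")
found, path = lookup(oakland, key)
print(key, path)
```

The requester never needs to know which node holds the data; it only asks for a number and each hop moves the request closer to it.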
58. FREENET 2/3
• There’s no such thing as ‘location’ on the freenet, just a way to
find more and more closely matching names.
• This is NOT an internet. It’s technically a ‘distributed data-
store’
• No ‘users’ or ‘servers’ in the traditional sense. The system itself
‘stores’ the data.
• It’s all encrypted: in storage, in transit, everywhere. Not even the
person holding the data knows what it is.
59. FREENET 3/3
• You CAN’T access the internet through the Freenet (it’s self-
contained)
• It’s slow (but the more connections there are, the faster it
gets)
• It’s “forgetful” (!)
• Wait a minute…the freenet is like a giant BRAIN!
• Unfortunately, the freenet doesn’t mix well with the law
60. AN INTERNET THAT’S
DIFFERENT™
• Monied interests are inevitable
• But WE, an intelligent, careful society, create a better future
• For certain things, centralization is good. For certain things,
decentralization is good (that’s the honest truth of technology)
• Transparency is key. Trust is key.
• Is this possible? Or is this the dream that powers the engine?