Algorithmic Accountability Reporting | Journalism Interactive 2014

Talk on "Algorithmic Accountability Reporting" by Nicholas Diakopoulos, a computer scientist, Tow Fellow at the Columbia University Journalism School and incoming member of the faculty at the Philip Merrill College of Journalism.

Usage Rights

CC Attribution-NonCommercial-ShareAlike License

  • Almost 14 years ago Lawrence Lessig taught us that “Code is Law” – that the architecture of systems, the code and algorithms that run them, can be a powerful influence on liberty. Let me give you a quick example of algorithmic power.
  • There are now dozens of sites online that collect, organize, and search-optimize mugshot photos. Having a mugshot online can be embarrassing, and many of these sites are effectively blackmailing people, demanding money to have their photo removed.
  • As a society we can deal with this shady behavior through laws, or through market forces like credit card companies refusing to process payments. Or it can be mitigated with algorithmic power, which is what Google just did by down-ranking these sites.
  • Here are some examples where I’ve found algorithms being used in government and corporations. I think there’s a gold mine of stories out there. Could algorithms be a new beat for computational journalists?
  • Traditionally investigative journalism has looked at uncovering hidden information about institutions. Turns out that algorithms are really good at hiding and obfuscating information so there’s a natural fit for journalism here. What we lack as a public is clarity on how an algorithm exercises its power over us, given that power is opaque, hidden behind complexity in a black box. We need to get *inside* that box.
  • The crux of algorithmic power is autonomous decision-making. We might start to assess algorithmic power by thinking about the atomic decisions that algorithms make, which can be composed to arrive at higher-level operations like summarization. This framework can help identify what to focus on when investigating an algorithm, and suggests questions about criteria, errors, biases in training data, and editorial criteria. Prioritization: fire inspections in New York; parolee attention. Classification: ContentID (infringing or not). Association: relationships between entities in an investigation. Filtering: censorship on social media, e.g. Chinese censorship, or the filtering of child pornography from search results.
  • Transparency is the vogue response these days and certainly an increasingly important way that journalists deal with their own bias. But there are some limitations and challenges to applying it to algorithms.
  • In reality the process is probably a bit closer to historiography (or archaeology): the 12th-century historian’s sample of data is whatever time has chosen to preserve in the form of written accounts, artifacts, and the archaeological record. You take what you can get, and interviews are not an option. If interviews with the designers do become available, they can form a powerful comparison point between the deduced and the expressed design intent, and lead to insights about whether the system is performing “as designed.”
  • Algorithms may be black boxes, but they have two little holes in the side, one for input and one for output. If you vary the inputs in enough different ways and pay close attention to the outputs, you can start piecing together a theory of how the algorithm works. I also like the analogy to recipes: given a cake and a set of ingredients, can you figure out the recipe that turns those ingredients into a similar cake?
  • At the WSJ they used reverse engineering to understand how online commerce sites like Staples dynamically set prices based on things like geography and browser history. This involved simulating thousands of surfing sessions and recording prices. Also: Rosetta Stone, Orbitz, Home Depot.
  • The tendency to see discount prices appeared to be tied most strongly to distance from an OfficeMax or Office Depot, Staples' main competitors. ZIP Codes within about 20 miles of a competitor's store tended to see discount prices. ZIP Codes farther away from a rival store tended to see higher prices, even more so if the ZIP Code contained a Staples store. Cities (which were more likely to have Staples stores as well as competitor stores) tended to see discount prices in the Journal's tests as well. The Journal also examined numerous other possible factors, including income and race, and found that none were tied as strongly to price as competitor-store locations. What they found is that, statistically speaking, the strongest correlation to price involved the distance to a rival’s store from the center of a ZIP Code. Prices tend to be higher in areas with less competition, including rural or poor areas.
  • So what might make a story out of an algorithm? Maybe it’s discriminatory or unfair in some way. It makes a mistake that denies a service. It censors something. Its output breaks a law or social norm. Or it falsely predicts something with real consequences.
  • Trends can be personalized, so different people see different trends. How are trends detected? What are the implications for publics forming around trends?
  • Filters are imperfect: they can de-emphasize legitimate reviews and leave fake reviews intact. Yelp is a massive review platform, not just for restaurants but for all kinds of small businesses, and businesses depend on the star ratings and reviews they get. To protect the consumer, Yelp has an algorithmic filter that weeds out suspicious reviews and moves them to another page, making it hard to see how the removed reviews aggregate or would contribute to the overall score. Explain the nature of the allegations: a business could have a low rating while several high ratings are hidden. Separately, a number of Russian YouTube videos have been blocked from within Germany. The reason? These videos contain background music playing from a Russian car radio.
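
The input-output probing described in the notes above can be sketched in a few lines. This is only a minimal sketch, not the Journal's actual method: `black_box_price`, its 20-mile rule, the prices, and the input names are all hypothetical stand-ins invented for illustration. The idea is to hold a baseline fixed, vary one input at a time, and record which inputs change the output.

```python
# Hypothetical stand-in for an opaque pricing system. In a real investigation
# this would be a scripted browsing session against a live site; the rule and
# the prices below are invented purely for illustration.
def black_box_price(miles_to_competitor, has_cookie):
    return 14.29 if miles_to_competitor <= 20 else 15.79


def probe(black_box, baseline, variations):
    """Vary one input at a time and report which inputs change the output."""
    base_output = black_box(**baseline)
    influential = {}
    for name, values in variations.items():
        # Re-run the black box with only this one input changed.
        outputs = [black_box(**{**baseline, name: value}) for value in values]
        influential[name] = any(out != base_output for out in outputs)
    return influential


baseline = {"miles_to_competitor": 5, "has_cookie": False}
variations = {
    "miles_to_competitor": [1, 10, 30, 60],
    "has_cookie": [True, False],
}
result = probe(black_box_price, baseline, variations)
print(result)
```

With this toy black box, the probe flags `miles_to_competitor` as influential and `has_cookie` as not. Varying one input at a time is the simplest design; it can miss effects that only appear when two inputs change together, which is one reason a real investigation needs many more sessions.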

Algorithmic Accountability Reporting | Journalism Interactive 2014 Presentation Transcript

  • 1. Algorithmic Accountability Reporting: On the Investigation of Black Boxes Nicholas Diakopoulos, Ph.D. Columbia University Journalism School (soon to be University of Maryland College of Journalism) @ndiakopoulos – http://www.nickdiakopoulos.com
  • 2. We should interrogate the architecture of cyberspace as we interrogate the code of Congress. -- Lawrence Lessig, Code is Law, 2000
  • 3. $$$ $$$ $$$ $$$ $$$ $$$ $$$ $$$
  • 4. Algorithms Are Everywhere
  • 5. Algorithmic Obfuscation Algorithms are opaque Technical complexity is a barrier
  • 6. Algorithmic Accountability How can we characterize the bias or power of an algorithm? When might algorithms be wronging us, or making consequential decisions? What role might journalists play in holding algorithmic power to account?
  • 7. Algorithmic Power: Decisions Prioritization, Classification, Association, Filtering
  • 8. Transparency. Voluntary incentives for self-disclosure about algorithms. Trade secrets, including FOIA exception. Gaming / manipulation: Goodhart’s Law, “when a measure becomes a target, it ceases to be a good measure.” Cognitive complexity: transparency information needs to be accessible and understandable.
  • 9. Adversarial Investigation Reverse Engineering “the process of extracting the knowledge or design blueprints from anything man-made” Systematic examination to unearth a model of how system works Uncover unintended side-effects as a result of implementation
  • 10. Input-Output of an Algorithm: Input, Output
  • 11. WSJ Price Discrimination: Staples.com (geo, cookies, prices). Do different people pay different prices depending on their geography or browser history? Yes! Jennifer Valentino-DeVries, Jeremy Singer-Vine, and Ashkan Soltani. Websites Vary Prices, Deals Based on Users’ Information. Wall Street Journal. Dec, 2012.
  • 12. Other Stories from Algorithms? Discriminatory / Unfair. Mistake that denies a service. Censorship. Breaks law or social norm. False Prediction.
  • 13. What’s Next? Teaching journalists to do algorithmic accountability: it’s messy and hard! Legal issues: EULAs, DMCA, Computer Fraud and Abuse Act. Ethical implications of publishing more info: gaming, individual privacy. Transparency policy: what factors to expose, frequency, format of disclosure.
  • 14. Thanks! Questions? Nick Diakopoulos Twitter: @ndiakopoulos Email: nicholas.diakopoulos@gmail.com Web: http://www.nickdiakopoulos.com More Info Algorithmic Accountability Reporting: On the Investigation of Black Boxes. Tow Center. Feb. 2014. http://towcenter.org/algorithmic-accountability-2/
  • 15. Algorithms In Media: Search Search Engine Autocomplete Google Autocomplete FAQ: “we exclude a narrow class of search queries related to pornography, violence, hate speech, and copyright infringement.” Editorial Criteria Boundaries of censorship, differences among search engines, mistakes?
  • 16. Algorithms In Media: Trends Implications for formation of publics? How are trends defined and measured? What might be missed as a result?
  • 17. Algorithms In Media: Filtering
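
The four atomic decisions named on slide 7 (prioritization, classification, association, filtering) can be made concrete with a small sketch. Everything here is an assumption for illustration: the building records, the `risk` scores, the owner names, and the 0.5 threshold are invented and do not come from the talk. The sketch also shows how one decision can be composed from another, since the filter is built on top of the classifier.

```python
# Hypothetical records invented for illustration; none of this is real data.
buildings = [
    {"id": "A", "risk": 0.9, "owner": "Acme LLC"},
    {"id": "B", "risk": 0.2, "owner": "Birch Co"},
    {"id": "C", "risk": 0.6, "owner": "Acme LLC"},
]


def prioritize(items):
    """Prioritization: rank items so the highest-risk ones come first."""
    return sorted(items, key=lambda b: b["risk"], reverse=True)


def classify(item, threshold=0.5):
    """Classification: assign a label using a chosen threshold."""
    return "inspect" if item["risk"] >= threshold else "skip"


def associate(items):
    """Association: link entities that share an attribute (here, the owner)."""
    groups = {}
    for b in items:
        groups.setdefault(b["owner"], []).append(b["id"])
    return groups


def filter_items(items, threshold=0.5):
    """Filtering: keep or drop items, composed here from the classifier."""
    return [b for b in items if classify(b, threshold) == "inspect"]


ranked_ids = [b["id"] for b in prioritize(buildings)]
labels = {b["id"]: classify(b) for b in buildings}
links = associate(buildings)
kept_ids = [b["id"] for b in filter_items(buildings)]
print(ranked_ids, labels, links, kept_ids)
```

Each function exposes a place where a human choice is baked in: the ranking key, the threshold, the attribute used to link entities, and the inclusion rule. Those are the kinds of criteria an investigation might try to surface.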