The document discusses using crowdsourcing and human computation to summarize and edit documents. It proposes a "Find-Fix-Verify" model that separates tasks into identifying areas for improvement, editing the content, and verifying the edits. When tested on sample paragraphs, this approach shortened them to roughly 78-90% of their original length while maintaining overall meaning. However, crowd workers performed best when removing unnecessary text and less well on edits that required domain knowledge.
Schemas for the Real World [Madison RubyConf 2013]
Carina C. Zona
Social app development challenges us to code for users’ personal worlds. Users are pushing back on ill-fitting assumptions about their identity, including name, gender, sexual orientation, important relationships, and other attributes they value.
How can we balance users’ realities with an app’s business requirements?
Facebook, Google+, and others are grappling with these questions. Resilient approaches arise from an app’s own foundation. Discover schemas’ influence over codebase, UX, and development itself. Learn how we can use schemas to both inspire users and generate data we need as developers.
--
META
Where: Madison Ruby Conference 2013 (Madison, Wisconsin, USA)
Date: August 23, 2013
Video: http://www.confreaks.com/videos/2627-madisonruby2013-schemas-for-the-real-world
Eddi: Interactive Topic-Based Browsing of Social Status Streams
Michael Bernstein
Talk given at UIST 2010 by Michael Bernstein.
Twitter streams are on overload: active users receive hundreds of items per day, and existing interfaces force us to march through a chronologically-ordered morass to find tweets of interest. We present an approach to organizing a user's own feed into coherently clustered trending topics for more directed exploration. Our Twitter client, called Eddi, groups tweets in a user’s feed into topics mentioned explicitly or implicitly, which users can then browse for items of interest. To implement this topic clustering, we have developed a novel algorithm for discovering topics in short status updates powered by linguistic syntactic transformation and callouts to a search engine. An algorithm evaluation reveals that search engine callouts outperform other approaches when they employ simple syntactic transformation and backoff strategies. Active Twitter users evaluated Eddi and found it to be a more efficient and enjoyable way to browse an overwhelming status update feed than the standard chronological interface.
To mark Prince Charles’ 65th birthday we have published some key public opinion statistics about the Royal Family. The statistics show that 78% of respondents are satisfied with the job Prince Charles is doing, but William still leads the polls as the most popular member of the Royal Family.
Ipsos MORI: Scottish Public Opinion Monitor: December 2013
Ipsos UK
Support for independence bounces back: As we enter the final nine months of campaigning before next year’s referendum, our latest poll for STV News will provide a boost for those arguing in favour of Scotland becoming an independent country. Among those certain to vote in next year’s referendum, 34% would vote ‘Yes’ if the referendum were held now (up by three percentage points from September 2013) while 57% would vote ‘No’ (down two points) and 10% are undecided.
The February Economist/Ipsos MORI issues index shows that, after January’s dead heat between the economy and race/immigration, concern about the latter among Britons has fallen by 7 percentage points to 34%, meaning that the economy is once again uncontested as the most important issue facing Britain today. Poll: http://www.ipsos-mori.com/researchpublications/researcharchive/3346/EconomistIpsos-MORI-February-2014-Issues-Index.aspx
Talk given at UIST 2010.
This paper introduces architectural and interaction patterns for integrating crowdsourced human contributions directly into user interfaces. We focus on writing and editing, complex endeavors that span many levels of conceptual and pragmatic activity. Authoring tools offer help with pragmatics, but for higher-level help, writers commonly turn to other people. We thus present Soylent, a word processing interface that enables writers to call on Mechanical Turk workers to shorten, proofread, and otherwise edit parts of their documents on demand. To improve worker quality, we introduce the Find-Fix-Verify crowd programming pattern, which splits tasks into a series of generation and review stages. Evaluation studies demonstrate the feasibility of crowdsourced editing and investigate questions of reliability, cost, wait time, and work time for edits.
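The staged structure of Find-Fix-Verify can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function name `find_fix_verify`, the agreement threshold `min_find_votes`, the strict-majority rule in the verify stage, and the simulated workers, which stand in for real crowd responses. It is not Soylent's implementation.

```python
from collections import Counter


def find_fix_verify(text, finders, fixers, verifiers, min_find_votes=2):
    """Return {span: [accepted rewrites]} for spans enough finders agree on."""
    # Find: independent workers flag (start, end) spans; keep spans with agreement.
    find_votes = Counter(span for finder in finders for span in set(finder(text)))
    patches = sorted(s for s, n in find_votes.items() if n >= min_find_votes)

    accepted = {}
    for start, end in patches:
        original = text[start:end]
        # Fix: a second group proposes rewrites (deduplicated, order kept).
        candidates = list(dict.fromkeys(fixer(original) for fixer in fixers))
        # Verify: a third group votes; keep rewrites a strict majority approves.
        approvals = Counter(
            c for verifier in verifiers for c in candidates if verifier(original, c)
        )
        accepted[(start, end)] = [
            c for c in candidates if approvals[c] * 2 > len(verifiers)
        ]
    return accepted


# Simulated workers (hypothetical stand-ins for real crowd responses).
text = "This is really very quite a long sentence."
wordy = (text.index("really"), text.index("a long"))
finders = [lambda t: [wordy], lambda t: [wordy], lambda t: [(0, 4)]]
fixers = [lambda s: "", lambda s: "quite ", lambda s: "REALLY!!! "]
verifiers = [lambda o, c: len(c) < len(o) and "!" not in c] * 3

result = find_fix_verify(text, finders, fixers, verifiers)
```

Only the span that two finders flagged survives the Find stage, and the Verify stage drops the rewrite that every simulated verifier rejects. Keeping the three stages in separate worker groups means no single worker's output reaches the document unreviewed.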
From Natural Language Processing to Artificial Intelligence
Jonathan Mugan
Overview of natural language processing (NLP) from both symbolic and deep learning perspectives. Covers tf-idf, sentiment analysis, LDA, WordNet, FrameNet, word2vec, and recurrent neural networks (RNNs).
In 1971, David Parnas wrote the great paper "On the Criteria To Be Used in Decomposing Systems into Modules," and yet the problem of breaking down big projects into small parts that work well together remains a struggle in the industry. The ability to decompose a problem space and, in turn, compose a solution is essential to our work.
Things have gotten worse since 1971. With microservices, big data, and streaming systems, we're all going to be distributed systems engineers sooner or later. In distributed systems, effective decomposition has an even greater impact on the reliability, performance, and availability of our systems as it determines the frequency and weight of communication in the system.
This talk speaks to the essential considerations for defining and evaluating boundaries and behaviors in large-scale distributed systems. It will touch on topics such as bulkhead design and architectural evolution.
Presentation discusses scientific method, common pitfalls of social media experiments. Defines some terms, shows neat tools, tries to move discussion forward.
Social media and mobile devices have combined to help create the always-with-us, always-on, always-connected campus. Not just student-to-student but, importantly, institution/faculty/staff-to-student as well as staff-to-staff. We need to look beyond the silo-ed, one-way web sites of the past towards more personal, two-way applications that take advantage of this sea change on campus. The ways in which our users will want to interact with us, the types of tasks they’ll want to complete, and the types of devices we’ll want to deliver to will just continue to proliferate.
Now is the time to reevaluate.
Using lessons learned at a large land-grant institution we’ll look at what the future friendly campus might look like, ways to plant the seed of that change and tips on how to accomplish it.
This presentation was given at the 2012 .eduGuru Summit on April 11, 2012.
There's an old joke that goes, “The two hardest things in programming are cache invalidation, naming things, and off-by-one errors.” In this talk, we'll discuss the subtle art of naming things – a practice we do every day but rarely talk about.
Researchers, Discovery and the Internet: What Next?
David Smith
A Web 2.0 issues and implications overview I put together for the Research Information Network as part of their workshop on researchers and discovery services.
http://www.rin.ac.uk/discovery-services-workshop
Quantifying the Invisible Audience in Social Networks
Michael Bernstein
Presented at CHI 2013
When you share content in an online social network, who is listening? Users have scarce information about who actually sees their content, making their audience seem invisible and difficult to estimate. However, understanding this invisible audience can impact both science and design, since perceived audiences influence content production and self-presentation online. In this paper, we combine survey and large-scale log data to examine how well users’ perceptions of their audience match their actual audience on Facebook. We find that social media users consistently underestimate their audience size for their posts, guessing that their audience is just 27% of its true size. Qualitative coding of survey responses reveals folk theories that attempt to reverse-engineer audience size using feedback and friend count, though none of these approaches are particularly accurate. We analyze audience logs for 222,000 Facebook users’ posts over the course of one month and find that publicly visible signals (friend count, likes, and comments) vary widely and do not strongly indicate the audience of a single post. Despite the variation, users typically reach 61% of their friends each month. Together, our results begin to reveal the invisible undercurrents of audience attention and behavior in online social networks.
Paper at http://hci.stanford.edu/publications/2013/CrowdWork/futureofcrowdwork-cscw2013.pdf
Paid crowd work offers remarkable opportunities for improving productivity, social mobility, and the global economy by engaging a geographically distributed workforce to complete complex tasks on demand and at scale. But it is also possible that crowd work will fail to achieve its potential, focusing on assembly-line piecework. Can we foresee a future crowd workplace in which we would want our children to participate? This paper frames the major challenges that stand in the way of this goal. Drawing on theory from organizational behavior and distributed computing, as well as direct feedback from workers, we outline a framework that will enable crowd work that is complex, collaborative, and sustainable. The framework lays out research challenges in twelve major areas: workflow, task assignment, hierarchy, real-time response, synchronous collaboration, quality control, crowds guiding AIs, AIs guiding crowds, platforms, job design, reputation, and motivation.
Analytic Methods for Optimizing Realtime Crowdsourcing
Michael Bernstein
Collective Intelligence 2012
Realtime crowdsourcing research has demonstrated that it is possible to recruit paid crowds within seconds by managing a small, fast-reacting worker pool. Realtime crowds enable crowd-powered systems that respond at interactive speeds: for example, cameras, robots and instant opinion polls. So far, these techniques have mainly been proof-of-concept prototypes: research has not yet attempted to understand how they might work at large scale or optimize their cost/performance trade-offs. In this paper, we use queueing theory to analyze the retainer model for realtime crowdsourcing, in particular its expected wait time and cost to requesters. We provide an algorithm that allows requesters to minimize their cost subject to performance requirements. We then propose and analyze three techniques to improve performance: push notifications, shared retainer pools, and precruitment, which involves recalling retainer workers before a task actually arrives. An experimental validation finds that precruited workers begin a task 500 milliseconds after it is posted, delivering results below the one-second cognitive threshold for an end-user to stay in flow.
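To make the cost/performance trade-off concrete, here is a minimal Python sketch using the Erlang loss (Erlang B) formula, a standard queueing-theory result. Treating the retainer pool as a loss system is an assumption made for this illustration, and the paper's exact model may differ. The names `erlang_b` and `smallest_pool`, and the reading of offered load as task arrival rate times the average time a worker is unavailable, are likewise hypothetical.

```python
def erlang_b(servers: int, offered_load: float) -> float:
    """Probability that all `servers` are busy in an M/M/c/c loss system."""
    blocking = 1.0  # with zero servers, every arrival finds the pool empty
    for k in range(1, servers + 1):
        # Standard numerically stable recurrence for the Erlang B formula.
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


def smallest_pool(offered_load: float, max_miss_probability: float) -> int:
    """Smallest pool size whose miss probability meets the target."""
    size = 0
    while erlang_b(size, offered_load) > max_miss_probability:
        size += 1
    return size
```

For example, `smallest_pool(1.0, 0.25)` searches upward from an empty pool until the chance that a task arrives with no worker waiting is at most 25%. Since each retained worker costs money, the smallest pool that meets the target is the cheapest one under this model.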
4chan and /b/: An Analysis of Anonymity and Ephemerality in a Large Online Co...
Michael Bernstein
Paper presented at ICWSM 2011
http://projects.csail.mit.edu/chanthropology
We present two studies of online ephemerality and anonymity based on the popular discussion board /b/ at 4chan.org: a website with over 7 million users that plays an influential role in Internet culture. Although researchers and practitioners often assume that user identity and data permanence are central tools in the design of online communities, we explore how /b/ succeeds despite being almost entirely anonymous and extremely ephemeral. We begin by describing /b/ and performing a content analysis that suggests the community is dominated by playful exchanges of images and links. Our first study uses a large dataset of more than five million posts to quantify ephemerality in /b/. We find that most threads spend just five seconds on the first page and less than five minutes on the site before expiring. Our second study is an analysis of identity signals on 4chan, finding that over 90% of posts are made by fully anonymous users, with other identity signals adopted and discarded at will. We describe alternative mechanisms that /b/ participants use to establish status and frame their interactions.
FeedMe: Enhancing Directed Content Sharing on the Web
Michael Bernstein
To find interesting, personally relevant web content, people rely on friends and colleagues to pass links along as they encounter them. In this paper, we study and augment link-sharing via e-mail, the most popular means of sharing web content today. Armed with survey data indicating that active sharers of novel web content are often those that actively seek it out, we developed FeedMe, a plug-in for Google Reader that makes directed sharing of content a more salient part of the user experience. FeedMe recommends friends who may be interested in seeing content that the user is viewing, provides information on what the recipient has seen and how many emails they have received recently, and gives recipients the opportunity to provide lightweight feedback when they appreciate shared content. FeedMe introduces a novel design space within mixed-initiative social recommenders: friends who know the user voluntarily vet the material on the user’s behalf. We performed a two-week field experiment (N=60) and found that FeedMe made it easier and more enjoyable to share content that recipients appreciated and would not have found otherwise.
HarambeeNet: Data by the people, for the people
1. Data by the people, for the people: Powering Interactions via the Social Web. Michael Bernstein, MIT CSAIL | user interface design group | haystack group | MIT human-computer interaction
2. Computer Science “In the most basic sense, a network is any collection of objects in which some pairs of these objects are connected by links.” - Easley and Kleinberg, page 2 [Zachary ‘77, via Easley and Kleinberg ‘10]
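The quoted definition maps directly onto a data structure. This is a minimal Python sketch: the people and links are invented for illustration, and `build_network` is a hypothetical helper name, not something from the talk.

```python
from collections import defaultdict


def build_network(links):
    """Turn a list of (a, b) links into an undirected adjacency map."""
    neighbors = defaultdict(set)
    for a, b in links:
        # A link connects a pair of objects, so record it in both directions.
        neighbors[a].add(b)
        neighbors[b].add(a)
    return dict(neighbors)


# Made-up objects (people) and links (friendships).
links = [("ana", "ben"), ("ben", "cy"), ("ana", "cy"), ("cy", "dee")]
network = build_network(links)

# Degree: how many links touch each person.
degree = {person: len(friends) for person, friends in network.items()}
```

Once the relationships are stored this way, quantities such as degree fall out of simple lookups.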
3. With the abstraction, we can: - Reason at high levels - Make predictions - Interact online - Model data http://www.flickr.com/marc_smith
4. Social Science “The analysis of patterns of social relationship in the group is then conducted on the graph, which is merely a shorthand representation of the ethnographic data.” - Zachary ‘77 [Zachary ‘77, via Easley and Kleinberg ‘10]
5. Methodological mismatch Many of you are sitting on terabytes of data about human interactions. The opportunities to scrape data – or more politely, leverage APIs – are also unprecedented. And folks are buzzing around wondering what they can do with all of the data they've got their hands on. But in our obsession with Big Data, we've forgotten to ask some of the hard critical questions about what all this data means and how we should be engaging with it. - danah boyd, WWW ‘10
18. Blog – 83% Print publishers are in a tizzy over Apple’s new iPad because they hope to finally be able to charge for their digital editions. But in order to get people to pay for their magazine and newspaper apps, they are going to have to offer something different that readers cannot get at the newsstand or on the open Web. Classic UIST – 87% The metaDESK effort is part of the larger Tangible Bits project. The Tangible Bits vision paper, which introduced the metaDESK along with and two companion platforms, the transBOARD and ambientROOM. Draft UIST – 90% In this paper we argue that it is possible and desirable to combine the easy input affordances of text with the powerful retrieval and visualization capabilities of graphical applications. We present WenSo, a tool that which uses lightweight text input to capture richly structured information for later retrieval and navigation in a graphical environment. Rambling E-mail – 78% A previous board member, Steve Burleigh, created our web site last year and gave me alot of ideas. For this year, I found a web site called eTeamZ that hosts web sites for sports groups. Check out our new page: […] Technical Computer Science – 82% Figure 3 shows the pseudocode that implements this design for Lookup. FAWN-DS extracts two fields from the 160-bit key: the i low order bits of the key (the index bits) and the next 15 low order bits (the key fragment).
19. Crowdproof: Human Proofreading Finds errors that AIs miss, explains the reason behind the problem in plain English, and suggests fixes
20. The Human Macro Macro scripting without programming ‘‘Please change text in document from past tense to present tense.’’ I gave one final glance around before descending from the barrow. As I did so, my eye caught something […] I give one final glance around before descending from the barrow. As I do so, my eye catches something […]
21. The Human Macro Macro scripting without programming ‘‘Pick out keywords from the paragrah like Yosemite, rock, half dome, park. Go to a site which hsa CC licensed images […]’’ When I first visited Yosemite State Park in California, I was a boy. I was amazed by how big everything was […] http://commons.wikimedia.org/wiki/File:03_yosemite_half_dome.jpg
22. The Human Macro Macro scripting without programming ‘‘Hi, please find the bibtex references for the 3 papers in brackets. You can located these by Google Scholar searches and clicking on bibtex.” Duncan and Watts [Duncan and watts HCOMP 09 anchoring] found that Turkers will do more work when you pay more, but that the quality is no higher. @conference { title={{Financial incentives […]}}, author={Mason, W. and Watts, D.J.}, booktitle={HCOMP ‘09}, […] }
23. Programming Crowd Workers Rule of Thumb: 30% of worker effort on open-ended tasks will have an error in it Two useful personas: The Lazy Turker and The Eager Beaver
24. The Lazy Turker Does as little work as necessary to be paid The theme of loneliness features throughout many scenes in Of Mice and Men and is often the dominant theme of sections during this story. This theme occurs during many circumstances but is not present from start to finish. In my mind for a theme to be pervasive is must be present during every element of the story. There are many themes that are present most of the way through such as sacrifice, friendship and comradship. But in my opinion there is only one theme that is present from beginning to end, this theme is pursuit of dreams.
25. The Lazy Turker Does as little work as necessary to be paid The theme of loneliness features throughout many scenes in Of Mice and Men and is often the dominant theme of sections during this story. This theme occurs during many circumstances but is not present from start to finish. In my mind for a theme to be pervasive is must be present during every element of the story. There are many themes that are present most of the way through such as sacrifice, friendship and comradeship. But in my opinion there is only one theme that is present from beginning to end, this theme is pursuit of dreams.
27. The Eager Beaver Go beyond task requirements to be helpful, but introduce errors in the process The theme of loneliness features throughout many scenes in Of Mice and Men and is often the dominant theme of sections during this story. This theme occurs during many circumstances but is not present from start to finish. In my mind for a theme to be pervasive is must be present during every element of the story. There are many themes that are present most of the way through such as sacrifice, friendship and comradship. But in my opinion there is only one theme that is present from beginning to end, this theme is pursuit of dreams.
28. The Eager Beaver Go beyond task requirements to be helpful, but introduce errors in the process The theme of loneliness features throughout many scenes in Of Mice and Men and is often the dominant theme of sections during this story. This theme occurs during many circumstances but is not present from start to finish. In my mind for a theme to be pervasive is must be present during every element of the story. There are many themes that are present most of the way through such as sacrifice, friendship and comradeship. But in my opinion there is only one theme that is present from beginning to end, this theme is pursuit of dreams.
29. Find-Fix-Verify A design pattern that controls the efforts of the Lazy Turker and the Eager Beaver. Separates open-ended tasks into three stages where each worker makes a clear contribution
30. Find: “Identify at least one area that can be shortened without changing the meaning of the paragraph.” Independent voting to identify patches. Fix: “Edit the highlighted section to shorten its length without changing the meaning of the paragraph.” Soylent, a prototype... Randomize order of suggestions. Verify: “Choose at least one rewrite that has significant style errors in it. Choose at least one rewrite that significantly changes the meaning of the sentence.”
31. Why Find-Fix-Verify? Why split Find and Fix? Forces Lazy Turkers to work on a problem of our choice. Allows us to merge work completed in parallel. Why add Verify? Quality rises when we put Turkers at odds with each other. Trades off lag time against quality.
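The three stages on the slides above can be sketched in Python. This is a minimal illustration with simulated workers, not Soylent's implementation: the function names, the worker callables, and both agreement thresholds (`min_agreement`, `max_reject`) are assumptions made for the example. It shows independent voting in Find, shuffled suggestions, and Verify voting out bad rewrites.

```python
import random
from collections import Counter


def find_stage(find_votes, min_agreement=0.5):
    """Keep patches flagged independently by enough Find workers.

    find_votes holds one list of flagged text spans per worker.
    min_agreement is a hypothetical threshold, not a value from the talk.
    """
    counts = Counter(span for worker in find_votes for span in set(worker))
    needed = min_agreement * len(find_votes)
    return sorted(span for span, n in counts.items() if n >= needed)


def verify_stage(rewrites, verify_votes, max_reject=0.5):
    """Drop rewrites that too many Verify workers flagged as bad."""
    rejects = Counter(r for worker in verify_votes for r in set(worker))
    limit = max_reject * len(verify_votes)
    return [r for r in rewrites if rejects[r] < limit]


def find_fix_verify(paragraph, finders, fixers, verifiers, rng):
    """Run Find, then Fix and Verify once per agreed patch."""
    patches = find_stage([find(paragraph) for find in finders])
    accepted = {}
    for patch in patches:
        # Fix: each worker proposes a rewrite for the same highlighted patch.
        rewrites = sorted({fix(paragraph, patch) for fix in fixers})
        # Verify: workers see the suggestions in randomized order.
        shown = rewrites[:]
        rng.shuffle(shown)
        votes = [verify(patch, shown) for verify in verifiers]
        accepted[patch] = verify_stage(rewrites, votes)
    return accepted


# Simulated crowd: every worker below is a stand-in, not real Turker data.
paragraph = ("they are going to have to offer something different that "
             "readers cannot get at the newsstand or on the open Web.")
finders = [
    lambda p: ["are going to have to"],
    lambda p: ["are going to have to", "on the open Web"],
    lambda p: [],  # a Lazy Turker who flags nothing
]
fixers = [
    lambda p, patch: "must",
    lambda p, patch: "will need to",
    lambda p, patch: "might",  # an Eager Beaver rewrite that changes the meaning
]
verifiers = [
    lambda patch, options: [o for o in options if o == "might"],
    lambda patch, options: [o for o in options if o == "might"],
    lambda patch, options: [],
]

result = find_fix_verify(paragraph, finders, fixers, verifiers, random.Random(0))
print(result)
```

Because each stage only counts votes, work done in parallel merges without any worker seeing another's answer, which is the property the slide credits to splitting Find from Fix.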
32. Data is made of people, Data is made by people, Data is made for people.
33. Collaborators Rob Miller, David Karger, Greg Little, Katrina Panovich, David Crowell, Mark Ackerman, Björn Hartmann …and about 9000 Turkers. I am generously kept off the streets by an NSF GRFP and NSF award IIS-0712793.
34. Blog Print publishers are in a tizzy over Apple’s new iPad because they hope to finally be able to charge for their digital editions. But in order to get people to pay for their magazine and newspaper apps, they are going to have to offer something different that readers cannot get at the newsstand or on the open Web. Classic UIST The metaDESK effort is part of the larger Tangible Bits project. The Tangible Bits vision paper introduced the metaDESK along with two companion platforms, the transBOARD and ambientROOM. Draft UIST In this paper we argue that it is possible and desirable to combine the easy input affordances of text with the powerful retrieval and visualization capabilities of graphical applications. We present WenSo, a tool that uses lightweight text input to capture richly structured information for later retrieval and navigation in a graphical environment. Rambling E-mail A previous board member, Steve Burleigh, created our web site last year and gave me alot of ideas. For this year, I found a web site called eTeamZ that hosts web sites for sports groups. Check out our new page: […] Highly Technical Writing Figure 3 shows the pseudocode that implements this design for Lookup. FAWN-DS extracts two fields from the 160-bit key: the i low order bits of the key (the index bits) and the next 15 low order bits (the key fragment).
35. Blog – 83% Print publishers are in a tizzy over Apple’s new iPad because they hope to finally be able to charge for their digital editions. But in order to get people to pay for their magazine and newspaper apps, they are going to have to offer something different that readers cannot get at the newsstand or on the open Web. Classic UIST – 87% The metaDESK effort is part of the larger Tangible Bits project. The Tangible Bits vision paper, which introduced the metaDESK along with and two companion platforms, the transBOARD and ambientROOM. Draft UIST – 90% In this paper we argue that it is possible and desirable to combine the easy input affordances of text with the powerful retrieval and visualization capabilities of graphical applications. We present WenSo, a tool that which uses lightweight text input to capture richly structured information for later retrieval and navigation in a graphical environment. Rambling E-mail – 78% A previous board member, Steve Burleigh, created our web site last year and gave me alot of ideas. For this year, I found a web site called eTeamZ that hosts web sites for sports groups. Check out our new page: […] Technical Computer Science – 82% Figure 3 shows the pseudocode that implements this design for Lookup. FAWN-DS extracts two fields from the 160-bit key: the i low order bits of the key (the index bits) and the next 15 low order bits (the key fragment).
36. Average Performance Cost: $1.41 per paragraph ($0.55 to Find an average of two patches, $0.48 to Fix each patch, $0.38 to Verify the results). Time: Wait: median 18.5 minutes (Q1 = 8.3 min, Q3 = 41.6 min); Work: median 2.0 minutes (Q1 = 60 sec, Q3 = 3.6 min)
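As a quick check on the cost figures, the sketch below sums the three stage costs and reports each stage's share. It assumes the three amounts are per-paragraph averages, which is the reading under which they add up to the stated $1.41. The dictionary and function names are made up for the example, and amounts are kept in integer cents to avoid floating-point rounding.

```python
# Stage costs from the slide, in cents; treating them as per-paragraph
# averages is an assumption.
STAGE_COST_CENTS = {"find": 55, "fix": 48, "verify": 38}


def total_cost_cents(stage_costs):
    """Total cost per paragraph in cents."""
    return sum(stage_costs.values())


def share_of_total(stage_costs):
    """Each stage's percentage of the total, rounded to one decimal place."""
    total = total_cost_cents(stage_costs)
    return {stage: round(100 * cents / total, 1)
            for stage, cents in stage_costs.items()}


total = total_cost_cents(STAGE_COST_CENTS)
print(f"${total / 100:.2f} per paragraph")
print(share_of_total(STAGE_COST_CENTS))
```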
37. Qualitative Observations Works best with unnecessary text: […] they are going to have to offer something different […] Lack of domain knowledge: […] In this paper we argue that tangible interfaces […] Parallel edits can be inconsistent: FAWN-DS extracts two fields from the 160-bit key: the i low order bits of the key (the index bits) and the next 15 low order bits (the key fragment).
Editor's Notes
When we're talking about social networks in computer science education, we have two methodological traditions to fuse. One is computer science, which we can see here through the lens of network science. It puts the network primary. Here is the first figure in the Easley and Kleinberg textbook, of a 34-person karate club.
This is an appealing definition and approach, because it provides a mathematical formalism that enables us to derive proofs, reason about groups at high levels and write interactive systems like Facebook. It doesn’t matter that friendship is a fuzzy concept: so long as both parties have agreed that it’s an undirected edge, we can do friend recommendation, build a news feed, and compute tie strengths (or as Facebook calls it, EdgeRank). It’s a very top-down approach, because computer scientists are good at dealing with lots of data.
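To make the abstraction concrete, here is a small sketch of friend recommendation over undirected edges. It uses a plain common-neighbors count, chosen only for illustration; it is not Facebook's method, and the people and edges are invented.

```python
from collections import Counter


def build_graph(edges):
    """Build an undirected friendship graph as a dict of neighbor sets."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def recommend_friends(graph, person):
    """Rank non-friends by how many friends they share with `person`."""
    friends = graph.get(person, set())
    scores = Counter()
    for friend in friends:
        for candidate in graph[friend]:
            if candidate != person and candidate not in friends:
                scores[candidate] += 1
    # Most mutual friends first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical five-person network.
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "dan"),
         ("cat", "dan"), ("cat", "eve")]
graph = build_graph(edges)
print(recommend_friends(graph, "ann"))  # [('dan', 2), ('eve', 1)]
```

Nothing in the code knows what a friendship means. It only needs each edge to be mutual, which is the top-down convenience described above.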
The other strong tradition in this space is characterized by social science: social psychology, sociology, cultural anthropology, and the broad spectrum of ideas and methodologies encompassed by conferences like CSCW. Where the computer science approach may put the network primary, social science puts the person primary. The goal of this approach is to understand why those links form, what they mean, and how they are used. This can be very bottom-up: social psychology, for instance, tends to take the individual as the unit of analysis. It asks questions like, “Why do groups form and split?”
When cultures collide, if we naively follow our methodological training, expectations get mismanaged. In her keynote at WWW, danah boyd critiqued the approach that many computer scientists take when they consider network problems: [quote] danah is largely referencing ethical and privacy questions, but in my mind there is an even bigger implication for computer science: we cannot write crowd programs without really knowing what it is that the crowd is doing.
danah would talk about the de-anonymization of the Netflix dataset. I have another angle on the situation: understanding humans was what ultimately won the million dollars. Basic collaborative filtering techniques can get you only so far. But one of the techniques that BellKor’s Pragmatic Chaos used was temporality: it turns out that when people rate a bunch of movies at a time, they tend to be movies that they saw a long time ago. And those kinds of movies exhibit a specific kind of rating pattern. The authors speculate, but I think this has to do with cognitive psychology: we are much more likely to remember events with high emotional arousal than those without, and more likely to remember positive events than negative events.
So it is when we program systems involving networks and crowds. We have a lot of data, and even more interest in that data, as demonstrated by the number of influential and award-winning papers that have been written by the amazing people sitting in front of me right now. When we talk about data, we are fundamentally bridging the attractive networks abstraction and the equally attractive social science abstraction. When we’re successful like BellKor’s Pragmatic Chaos was, it takes us farther than either process in isolation.
I’m a social computing systems builder: I build interfaces that are powered by social data and interfaces that encourage social interaction. To do this well, I have to get this balance right. I want to share with you a few ways in which I’ve been using the social web to develop new tools, and the ways in which we have wrestled with humans and algorithms simultaneously to make them work.
For years, human-computer interaction researchers have used Wizard of Oz techniques to prototype interactive systems. This technique typically meant having one of the design team members behind a curtain simulating parts of an artificial intelligence that hadn’t been built yet. But we now have artificial intelligence for hire via services like Amazon Mechanical Turk, where you can pay cents for workers largely in the U.S. and India to perform tasks for you. The Soylent project asks: what happens when you embed those workers inside of an interface -- when you have a Wizard of Turk? Can we help end users when interfaces aren’t necessarily bound by AI-hard problems any more, but by humans? Here are a few preliminary thoughts, which will show up at the ACM UIST conference this year.
We are focused on writing. We’ve been learning to write since grade school; it’s the stock-in-trade of how most of us exchange ideas today. I think we can all agree that writing is hard. Even seasoned experts make mistakes: non-parallel constructions, typos, or just plain being unclear. If we make a high-level decision like changing a story from past tense to present tense or shifting references from ACM format to MLA format, we have to execute a daunting number of tasks. And of course, when we have that 10-page limit and our paper is 11 pages, we spend hours whittling our writing down to size.
Let me shift to the data aspects of this. To make these interfaces, we need algorithms with human callouts in them. But, we don’t really know how to do this yet. Turkers are people, and using an extrinsic motivation like payment can lead to weird effects. We’ve created two useful personas that guide our work: