Layman's Talk: Entities of Interest --- Discovery in Digital Traces

Why people stop using Sina Weibo?

Aoran Yang

Privacy-preserving Data Mining in Industry (WSDM 2019 Tutorial)

Preserving privacy of users is a key requirement of web-scale data mining applications and systems such as web search, recommender systems, crowdsourced platforms, and analytics applications, and has witnessed a renewed focus in light of recent data breaches and new regulations such as GDPR. In this tutorial, we will first present an overview of privacy breaches over the last two decades and the lessons learned, key regulations and laws, and evolution of privacy techniques leading to differential privacy definition / techniques. Then, we will focus on the application of privacy-preserving data mining techniques in practice, by presenting case studies such as Apple's differential privacy deployment for iOS / macOS, Google's RAPPOR, LinkedIn Salary, and Microsoft's differential privacy deployment for collecting Windows telemetry. We will conclude with open problems and challenges for the data mining / machine learning community, based on our experiences in industry.

Text analysis-semantic-search

Online text data for machine learning, data science, and research - Who can p...

Fredrik Olsson

In February of 2011, Google began launching the Panda Update (bears), the first of many steps away from a link-based model of relevance to a user experience model of relevance. This bearish focus on relevance uses algorithms to determine a positive user experience focused on click-through (does the user select the result), bounce rate (does the user take action once they arrive at the landing page) and conversion (does the landing page satisfy the user’s information need). Content and information design became the foundation for relevance. Sadly, no one at Google told the content strategists, user experience professionals and information architects about their new influence on search engine performance. In April of 2012, Google followed up with the Penguin update (birds), a direct assault on link building, a mainstay of traditional search engine optimization (SEO). The Penguin algorithm evaluates the context and quality of links pointing to a site. Websites found to be “over optimized” with low quality links are removed from Google’s index. Matt Cutts, Google Webmaster and the public face of Google, summed this up best: “And so that’s the sort of thing where we try to make the web site, uh Google Bot smarter, we try to make our relevance more adaptive so that people don’t do SEO, we handle that...” Sadly, Google is short on detail about how they are handling SEO, what constitutes adaptive relevance and how user experience professionals, information architects and content strategists can contribute thought-processing biped wisdom to computational algorithmic adaptive relevance so that searchers find what they are looking for even when they do not know what that is. This presentation will provide a brief introduction to the inner workings of information retrieval, the foundation of all search engines, even Google.
On this foundation, I will dive deep into the Bs of how to optimize Web sites for today’s search technology: Be focused, Be authoritative, Be contextual and Be engaging. Birds (Penguin), Bears (Panda) & Bees: Optimal SEO will provide insight into recent search engine changes, prescriptive optimization guidance for usability and content strategy and foresight into the future direction of search.

Researching Social Media – Big Data and Social Media Analysis

Privacy-preserving Data Mining in Industry (WWW 2019 Tutorial)

Social Media Analysis: Present and Future

matthewhurst

Social media engagement

Matching Mobile Applications for Cross Promotion

Ethics in Data Science and Machine Learning

HJ van Veen

Social Media Forensics for Investigators

Case IQ

With 1.2 billion monthly active users on Facebook alone, it’s not surprising that social media networks can be a rich source of information for investigators. And because Americans spend more time on social media than any other major Internet activity, including email, social media information and evidence is plentiful. You just need to know how to get it. Finding, preserving and collecting social media evidence often requires some forensic skills, as well as an understanding of the laws that govern its collection and use. It’s important for investigators to be aware of both the possibilities and limitations of social media forensics.

Frontiers of Computational Journalism week 3 - Information Filter Design

disinformation risk management: leveraging cyber security best practices to s...

Sara-Jayne Terp

Smashing silos ia-ux-meetup-mar112014

Marianne Sweeny

While we have been busy trying to "define the damn thing" IA or answering the age-old question of who rules, UX, IxDA or IA, the search engines have been busily transitioning to a machine-mediated experience model for ranking. This means that SEO is now the responsibility of UX/IA whether we like it or not. This presentation lays out how search engines evaluate user experience and how we can influence this evaluation with an optimized design.

Creating a Data-Driven Government: Big Data With Purpose

Tyrone Grandison

The U.S. Department of Commerce collects, processes and disseminates data on a range of issues that impact our nation. Whether it's data on the economy, the environment, or technology, data is critical in fulfilling the Department's mission of creating the conditions for economic growth and opportunity. It is this data that provides insight, drives innovation, and transforms our lives. The U.S. Department of Commerce has become known as "America's Data Agency" due to the tens of thousands of datasets including satellite imagery, material standards and demographic surveys. But having a host of data and ensuring that this data is open and accessible to all are two separate issues. The latter, expanding open data access, is now a key pillar of the Commerce Department's mission. It was this focus on enhancing open data that led to the creation of the Commerce Data Service (CDS). The mission at the Commerce Data Service is to enable more people to use big data from across the department in innovative ways and across multiple fields. In this talk, I will explore how we are using big data to create a data-driven government. This talk is a keynote given at the Texas Tech University's Big Data Symposium.

Advanced Keyword Research SMX Toronto March 2013

BrightEdge

Sentiment Analysis and Social Media: How and Why

Davide Feltoni Gurini

Designing Cybersecurity Policies with Field Experiments

Fairness, Transparency, and Privacy in AI @ LinkedIn

How do we protect privacy of users in large-scale systems? How do we ensure fairness and transparency when developing machine learned models? With the ongoing explosive growth of AI/ML models and systems, these are some of the ethical and legal challenges encountered by researchers and practitioners alike. In this talk (presented at QConSF 2018), we first present an overview of privacy breaches as well as algorithmic bias / discrimination issues observed in the Internet industry over the last few years and the lessons learned, key regulations and laws, and evolution of techniques for achieving privacy and fairness in data-driven systems. We motivate the need for adopting a "privacy and fairness by design" approach when developing data-driven AI/ML models and systems for different consumer and enterprise applications. We also focus on the application of privacy-preserving data mining and fairness-aware machine learning techniques in practice, by presenting case studies spanning different LinkedIn applications, and conclude with the key takeaways and open challenges.

Frontiers of Computational Journalism week 2 - Text Analysis

Myths and challenges in knowledge extraction and analysis from human-generate...

Marco Brambilla

For centuries, science (in German "Wissenschaft") has aimed to create ("schaften") new knowledge ("Wissen") from the observation of physical phenomena, their modelling, and empirical validation. Recently, a new source of knowledge has emerged: not (only) the physical world any more, but the virtual world, namely the Web with its ever-growing stream of data materialized in the form of social network chattering, content produced on demand by crowds of people, messages exchanged among interlinked devices in the Internet of Things. The knowledge we may find there can be dispersed, informal, contradictory, unsubstantiated and ephemeral today, while already tomorrow it may be commonly accepted. The challenge is once again to capture and create knowledge that is new, has not been formalized yet in existing knowledge bases, and is buried inside a big, moving target (the live stream of online data). The myth is that existing tools (spanning fields like semantic web, machine learning, statistics, NLP, and so on) suffice for the objective. While this may still be far from true, some existing approaches are actually addressing the problem and provide preliminary insights into the possibilities that successful attempts may lead to. The talk explores the mixed realistic-utopian domain of knowledge extraction and reports on some tools and cases where the digital and physical worlds have been brought together to better understand our society.

Getting Started in Data Science

What's hot

The language of social media

Adding value to NLP: a little semantics goes a long way

Using language to save the world: interactions between society, behaviour and...

Birds Bears and Bs:Optimal SEO for Today's Search Engines

Marianne Sweeny

Similar to Layman's Talk: Entities of Interest --- Discovery in Digital Traces

Myths and challenges in knowledge extraction and analysis from human-generate...

Marco Brambilla

Getting Started in Data Science

London data and digital masterclass for councillors slides 14-Feb-20

LG Inform Plus

Getting started in Data Science (April 2017, Los Angeles)

Career in Data Science (July 2017, DTLA)