Talk given August 29, 2018 at the 1st Biannual Conference on Design of Experimental Search & Information Retrieval Systems (DESIRES 2018). Paper: https://www.ischool.utexas.edu/~ml/papers/lease-desires18.pdf
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Who’s in the Gang? Revealing Coordinating Communities in Social Media by Derek Weber
Political astroturfing and organised trolling are online malicious behaviours with significant real-world effects. Common approaches examining these phenomena focus on broad campaigns rather than the small groups responsible. To reveal networks of cooperating accounts, we propose a novel temporal window approach that relies on account interactions and metadata alone. It detects groups of accounts engaging in behaviours that, in concert, execute different goal-based strategies, which we describe. Our approach is validated against two relevant datasets with ground truth data. See https://github.com/weberdc/find_hccs for code and data.
Presented at ASONAM'20 (2020 IEEE/ACM International Conference on Advances in Social Network Analysis and Mining).
Co-authored with Frank Neumann (University of Adelaide)
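As a rough illustration of the temporal window idea described in the abstract above, the following sketch buckets account interactions into fixed time windows, links accounts that act on the same target within the same window, and reports the connected groups. It is not the authors' implementation (their code is at the repository linked above). The function name `coordinating_groups`, the `(account, timestamp, target)` event layout, and the `window_seconds` and `min_weight` parameters are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def coordinating_groups(events, window_seconds=60, min_weight=2):
    """Group accounts that repeatedly act on the same target in the same window.

    events: iterable of (account, timestamp_seconds, target) tuples
            (a hypothetical layout chosen for this sketch).
    Returns a list of sorted account lists, one per connected group.
    """
    # Bucket accounts by (window index, target).
    buckets = defaultdict(set)
    for account, ts, target in events:
        buckets[(int(ts // window_seconds), target)].add(account)

    # Count how often each pair of accounts shares a bucket.
    weights = defaultdict(int)
    for accounts in buckets.values():
        for a, b in combinations(sorted(accounts), 2):
            weights[(a, b)] += 1

    # Keep only pairs that co-occur at least min_weight times.
    adjacency = defaultdict(set)
    for (a, b), weight in weights.items():
        if weight >= min_weight:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Extract connected components with an iterative depth-first search.
    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups
```

Fixed, non-overlapping windows keep the sketch short; accounts acting just either side of a window boundary are missed, which a sliding window would catch at the cost of more bookkeeping.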
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact... by Matthew Lease
Presented at the 31st ACM User Interface Software and Technology Symposium (UIST), 2018. Paper: https://www.ischool.utexas.edu/~ml/papers/nguyen-uist18.pdf
Explainable Fact Checking with Humans in-the-loop by Matthew Lease
Invited Keynote at KDD 2021 TrueFact Workshop: Making a Credible Web for Tomorrow, August 15, 2021.
https://www.microsoft.com/en-us/research/event/kdd-2021-truefact-workshop-making-a-credible-web-for-tomorrow/#!program-schedule
Data Journalism and the Remaking of Data Infrastructures by Liliana Bounegru
Talk given at the “Evidence and the Politics of Policymaking” Conference, University of Bath, 14th September 2016, on the basis of my PhD research at the University of Groningen and University of Ghent.
http://www.bath.ac.uk/ipr/events/news-0230.html.
Social Media Mining - Chapter 10 (Behavior Analytics) by SocialMediaMining
R. Zafarani, M. A. Abbasi, and H. Liu, Social Media Mining: An Introduction, Cambridge University Press, 2014.
Free book and slides at http://socialmediamining.info/
Ethics in Data Science and Machine Learning by HJ van Veen
Introduction and overview on ethics in data science and machine learning, variations and examples of algorithmic bias, and a call-to-action for self-regulation. Given by Thierry Silbermann as part of the Sao Paulo Machine Learning Meetup, theme: "Ethics".
https://www.linkedin.com/in/thierrysilbermann
https://twitter.com/silbermannt
https://github.com/thierry-silbermann
30 Tools and Tips to Speed Up Your Digital Workflow by Mike Kujawski
Have you ever found yourself wasting a considerable amount of time performing some annoying, repetitive process within a common application, social media website, or your web browser? Wish there was a "magic" shortcut or simply a better way of getting it done? There most likely is.
While having a solid strategy should always be the first priority before engaging in the digital/social media space, it's also smart to arm yourself with a set of tools that will help you with the tactical implementation of your plan. These presentation slides provide 30 tools and tips hand-picked from Mike Kujawski's personal experience, day-to-day observations, and interactions with his consulting and training clients.
These tools are meant to help you be more efficient and effective as a communicator in today's digital world, where agility and "life-hacking" skills are becoming increasingly valued.
Introduction to the Responsible Use of Social Media Monitoring and SOCMINT Tools by Mike Kujawski
These are my slides from a custom tool-based demonstration workshop I was asked to do where I went over various free tools that can be used to obtain valuable public data.
Creating a Data-Driven Government: Big Data With Purpose by Tyrone Grandison
The U.S. Department of Commerce collects, processes and disseminates data on a range of issues that impact our nation. Whether it's data on the economy, the environment, or technology, data is critical in fulfilling the Department's mission of creating the conditions for economic growth and opportunity. It is this data that provides insight, drives innovation, and transforms our lives. The U.S. Department of Commerce has become known as "America's Data Agency" due to the tens of thousands of datasets including satellite imagery, material standards and demographic surveys.
But having a host of data and ensuring that this data is open and accessible to all are two separate issues. The latter, expanding open data access, is now a key pillar of the Commerce Department's mission. It was this focus on enhancing open data that led to the creation of the Commerce Data Service (CDS).
The mission at the Commerce Data Service is to enable more people to use big data from across the department in innovative ways and across multiple fields. In this talk, I will explore how we are using big data to create a data-driven government.
This talk is a keynote given at Texas Tech University's Big Data Symposium.
We are providing training on IEEE 2016-17 projects for Ph.D. scholars, M.Tech, B.E, MCA, BCA and Diploma students for
all branches for their academic projects.
For more details call us or WhatsApp us @ 7676768124 or 9545252155
Email your base papers to "adritsolutions@gmail.co.in"
We are providing IEEE projects on
1) Cloud Computing, Data Mining, BigData Projects Using Java
2) Image Processing and Video Processing (MATLAB), Signal Processing
3) NS2 (Wireless Sensor, MANET, VANET)
4) Android Apps
5) JAVA, JEE, J2EE, J2ME
6) Mechanical Design projects
7) Embedded Systems and IoT Projects
8) VLSI- Verilog Projects (ModelSim and Xilinx using FPGA)
For More details Please Visit us at
Adrit Solutions
Near Maruthi Mandir
#42/5, 18th Cross, 21st Main
Vijaynagar
Bangalore.
How to social scientists use link data (11 june2010) by Han Woo PARK
The author would like to thank Bernie Horgan, Rob Ackland, Jeong-Soo Seo, and Yeon-ok Lee for their helpful comments on an earlier draft. Part of this research was carried out during the author’s stay at the Oxford Internet Institute. During the preparation of the final manuscript, this research was supported by the WCU project granted by the South Korean Government. This paper was presented at the 2010 International Communication Association conference held in Singapore. http://www.icahdq.org/conferences/2010/
In order to explore public attitudes towards the use of data from online services (e.g. social media) or digital devices (e.g. mobile phone GPS), we are running a Twitter-based campaign (#AnalyzeMyData) in which we remind people of instances of data usage that have been reported in news stories and ask them to rate whether they consider these data uses to be OK. In order to build momentum in public participation, we designed the experiment as a sustained campaign in which a different news item is presented each day over a period of multiple weeks. Each Tweet includes a link to a mini-survey which asks participants to respond 'yes', 'no' or 'depends'. To further motivate continued participation as the campaign progresses, we provide a running update on our website of the response statistics for the items that were previously Tweeted. The types of data usage included in the campaign range from academic studies of social media use to data collection for product development, marketing and government studies. Our hope is that this campaign/experiment will 1) help to raise awareness of the various ways in which personal data, acquired through online services or digital devices, is currently being used, and 2) provide a large dataset of case studies with an associated baseline of public acceptance/rejection that can be used for future research ethics guidelines and review training.
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver... by Saurabh Mishra
This group reviewed data and measurements indicating the positive potential of AI to serve the Sustainable Development Goals (SDGs). Alongside these optimistic inquiries, this group also investigated the risks of AI in areas such as privacy, vulnerable populations, human rights, workplace and organizational policy. The socio-political consequences of AI raise many complex questions which require continued rigorous examination.
Talk given at Delft University speaker series on "Crowd Computing & Human-Centered AI" (https://www.academicfringe.org/). November 23, 2020. Covers two 2020 works:
(1) Anubrata Das, Brandon Dang, and Matthew Lease. Fast, Accurate, and Healthier: Interactive Blurring Helps Moderators Reduce Exposure to Harmful Content. In Proceedings of the 8th AAAI Conference on Human Computation and Crowdsourcing (HCOMP), 2020.
(2) Alexander Braylan and Matthew Lease. Modeling and Aggregation of Complex Annotations via Annotation Distances. In Proceedings of the Web Conference, pages 1807--1818, 2020.
Social Media Mining - Chapter 9 (Recommendation in Social Media) by SocialMediaMining
R. Zafarani, M. A. Abbasi, and H. Liu, Social Media Mining: An Introduction, Cambridge University Press, 2014.
Free book and slides at http://socialmediamining.info/
ESRC Research Methods Festival - From Flickr to Snapchat: The challenge of an... by Farida Vis
From Flickr to Snapchat: The challenge of analysing images on social media. Presentation part of the 'Challenges/Opportunities of Using Social Media for Social Science Research' panel. 9th of July 2014
Mobile Marketing Strategies for 2014 and Beyond by Fiksu
Apps are strategic touch points for content consumption, social activities and daily life, and all the data continues to show a complete consumer transition to apps as the way they interact on mobile devices.
87% of consumer time spent in mobile… is in apps
Games, news, productivity, utility, communications, social networking…
Just 13% of consumer time is spent in the mobile web
In browser, surfing the web, emulating the desktop experience.
Mobile Bootcamp - Building Mobile Applications by Enola Labs
Atomic Axis teaches you the process of building a mobile application. From discovery to design to development to deployment, you will be provided with a basis of information on exactly how to begin to turn your idea for a mobile application into a viable product.
Brands are building up their mobile presence - comprised of apps, websites, and app stores - with the goal of interacting and engaging with consumers across every touch point. But why have relatively few brands effectively mastered the mobile channel? Find out in a report detailing survey findings of 1,000+ mobility influencers across the US and UK. We uncovered how much brands are investing in mobile projects, what their mobile priorities are and what frustrates them about mobilizing their businesses.
In this presentation, I try to explore every potential that the mobile environment has. There are also some "predictions" about the future of the mobile environment and apps.
Safcodes is a leading Android and iOS app development company in Dubai that offers cutting-edge mobile application development services for Android and iOS devices.
Performics & ZenithOptimedia: Driving ROI through Smarter Use of Mobile by Performics
Best practices for creating engaging mobile experiences (on-site (e.g. landing pages, apps) and off-site (e.g. search, social and display advertising)), and measuring the impact of mobile: across devices and digital-to-store from the performance marketing experts.
20x more time in apps than mobile web visitors spend in the mWeb
3x more conversion than mobile websites
55% of mobile commerce takes place in apps rather than websites
With more and more people owning smartphones year over year, the growth of mobile web usage is explosive globally and especially in the United States. Projections show a huge increase in consumers' use of mobile health websites and applications, so the opportunity to engage mobile consumers is NOW. In this webinar we will discuss how to target and engage health consumers through mobile landing pages, applications and advertising.
The Challenges of a Rapidly Changing Local Marketplace by Localogy
Local Search Association VP of Strategy & Insights Greg Sterling's remarks during the Search and Information Industry Association's (SIINDA) "Making Transactions Happen" event in Munich.
Developing A Big Data Analytics Framework for Industry Intelligence by Gene Moo Lee
Researchers often model industry as a network where each node corresponds to an organization and an edge represents an inter-organizational relationship (e.g., competition, acquisition, alliance). Structural holes are an important construct in identifying network opportunity structures. While there have been significant theoretical and empirical works around this concept, there has been limited fine-grained empirical research on the operationalization of the structural hole concept based on organizational self-identified strategic posturing. In this project, we propose an innovative method to quantify self-identified strategic posturing structural holes using a machine learning approach called doc2vec, which transforms textual documents into numeric vector representations. Specifically, we apply the doc2vec model to the collection of 10-K annual reports from U.S. public firms in the 1995-2016 period. To show the effectiveness of our measure, we conducted empirical analyses on firm birth (i.e., IPO) and firm mortality (i.e., delisting) using Compustat data. First, our firm birth analysis, using the generalized linear model, shows that new organizations have an increasing birth rate in structural holes between a pair of existing firms. Second, using the Cox proportional hazard model, we show that organizations entering into a structural hole have a significant decrease in mortality rates. This is the first large-scale empirical study to use self-identified strategic posturing structural holes in the analysis of industry dynamics, and as such provides an advance to both the industry dynamics and network literature.
State of Artificial intelligence Report 2023 by kuntobimo2016
Artificial intelligence (AI) is a multidisciplinary field of science and engineering whose goal is to create intelligent machines.
We believe that AI will be a force multiplier on technological progress in our increasingly digital, data-driven world. This is because everything around us today, ranging from culture to consumer products, is a product of intelligence.
The State of AI Report is now in its sixth year. Consider this report as a compilation of the most interesting things we’ve seen with a goal of triggering an informed conversation about the state of AI and its implication for the future.
We consider the following key dimensions in our report:
Research: Technology breakthroughs and their capabilities.
Industry: Areas of commercial application for AI and its business impact.
Politics: Regulation of AI, its economic implications and the evolving geopolitics of AI.
Safety: Identifying and mitigating catastrophic risks that highly-capable future AI systems could pose to us.
Predictions: What we believe will happen in the next 12 months and a 2022 performance review to keep us honest.
Analysis insight about a Flyball dog competition team's performance by roli9797
Insights from my analysis of a Flyball dog competition team's performance over the last year. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
Learn SQL from basic queries to Advance queries by manishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf by GetInData
Recently we have observed the rise of open-source Large Language Models (LLMs) that are community-driven or developed by the AI market leaders, such as Meta (Llama3), Databricks (DBRX) and Snowflake (Arctic). On the other hand, there is a growth in interest in specialized, carefully fine-tuned yet relatively small models that can efficiently assist programmers in day-to-day tasks. Finally, Retrieval-Augmented Generation (RAG) architectures have gained a lot of traction as the preferred approach for LLMs context and prompt augmentation for building conversational SQL data copilots, code copilots and chatbots.
In this presentation, we will show how we built upon these three concepts a robust Data Copilot that can help to democratize access to company data assets and boost performance of everyone working with data platforms.
Why do we need yet another (open-source) Copilot?
How can we build one?
Architecture and evaluation
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, AI, big data, real-time, robots and Milvus.
A lively discussion with NJ Gen AI Meetup Lead, Prasad and Procure.FYI's Co-Found
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag... by sameer shah
"Join us for STATATHON, a dynamic 2-day event dedicated to exploring statistical knowledge and its real-world applications. From theory to practice, participants engage in intensive learning sessions, workshops, and challenges, fostering a deeper understanding of statistical methodologies and their significance in various fields."
The Building Blocks of QuestDB, a Time Series Database by javier ramirez
Talk Delivered at Valencia Codes Meetup 2024-06.
Traditionally, databases have treated timestamps as just another data type. However, when performing real-time analytics, timestamps should be first-class citizens, and we need rich time semantics to get the most out of our data. We also need to deal with ever-growing datasets while staying performant, which is as fun as it sounds.
It is no wonder time-series databases are now more popular than ever before. Join me in this session to learn about the internal architecture and building blocks of QuestDB, an open source time-series database designed for speed. We will also review a history of some of the changes we have gone through over the past two years to deal with late and unordered data, non-blocking writes, read-replicas, or faster batch ingestion.
Global Situational Awareness of A.I. and where its headed by vikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be unleashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
1. Matching Mobile Applications for Cross Promotion
Gene Moo Lee
Ph.D. candidate
University of Texas at Austin
Joint work with Joowon Lee and Andrew B. Whinston
2. Success of mobile app markets
[Figure: bar chart of the number of apps (in thousands, axis 0 to 1600) for Google Play, Apple App Store, Windows Phone, and BlackBerry]
Big Data Marketing Analytics, Chicago, IL
3. Success and challenges
• App diversity is a key success factor
• Too many apps: visibility/search issue
• Developers: difficult to gain visibility
• Users: hard to search for the right apps
• Going towards a “winner-takes-all” market, not a long-tail market [Petsas et al. 2013], [Zhong and Michahelles 2013]
4. Mobile app ad channels
Channels: Mobile Display Ads, Cross Promotions (Incentivized), Social Network Ads
In this work, we study cross promotions:
• Incentivize app installs with rewards
• It is a two-sided matching market
5. Our contributions
1. Evaluate ad effectiveness of cross promotion
2. Model ad placements as matching problem
3. Identify determinants of ad effectiveness: novel app similarity measure with machine learning technique
4. Design app matching mechanism
6. Roadmap
• Data
• Model
• App Similarity
• Empirical Analysis
• Matching Mechanism
7. IGAWorks data
• IGAWorks: mobile ad platform company in Korea
• 1011 cross promotions (Sept 2013 to May 2014)
• 325 mobile apps (195K apps with meta info)
• 1 million users
8. Ad effectiveness
• We measure ad effectiveness by the user engagement level:
  • Session duration
  • Number of app usages
  • Number of item purchases
• Comparison by inflow channels:
  • Organic users, mobile display ads, cross promotions
  • Cross promotions: the best 10% and 1% of placements
9. Ad effectiveness
• Bad news: free riders don’t stay
• Good news: good matching can make improvements!
• Question: what makes a good match?
[Figure: bar chart of hour_avg (axis 0 to 20) by channel: total, organic, display, cross, cross_10%, cross_1%]
10. Model
• Source s: (established) app where ad is placed
• Target t: (new) app to promote
• Ad effectiveness of a match u(s, t) depends on
  • (1) Characteristics of source and target: X_s, X_t
  • (2) Similarity between source and target: P_s,t
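The slides do not give the functional form of u(s, t). A minimal sketch, assuming a linear specification in which the intercept \beta_0, the coefficient vectors \beta_s and \beta_t, the similarity coefficient \gamma, and the error term \varepsilon_{s,t} are illustrative symbols that do not come from the talk:

```latex
u(s, t) = \beta_0 + \beta_s^{\top} X_s + \beta_t^{\top} X_t + \gamma \, P_{s,t} + \varepsilon_{s,t}
```

Under this assumed form, a positive \gamma would mean that more similar source and target apps yield more effective placements.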
11. Individual app features
• Developer-given variables:
  • days_regist (age)
  • days_update (engagement)
  • file_size (functional complexity)
• User-given variables:
  • num_rates (visibility)
  • avg_rate (user perceived app quality)
• Both source and target
12. App similarity
• Assumption: similar text descriptions -> similar apps
• Our approach: Apply latent Dirichlet allocation [Blei et al. 2003] algorithm on text descriptions of all apps
• App market is described by “topics”
• Each app is represented by a topic vector
• Calculate Topic_Similarity with cosine similarity
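The last step above, cosine similarity between two topic vectors, can be sketched in plain Python. This is a minimal illustration and not the authors' code: the function name `cosine_similarity`, the three-topic vectors, and the app names are all hypothetical (the talk uses 100 topics learned by LDA, which is not reproduced here).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length topic vectors."""
    if len(a) != len(b):
        raise ValueError("topic vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_a * norm_b)


# Hypothetical topic distributions over 3 topics (assumed values).
puzzle_game = [0.7, 0.2, 0.1]
card_game = [0.6, 0.3, 0.1]
banking_app = [0.05, 0.05, 0.9]

sim_games = cosine_similarity(puzzle_game, card_game)
sim_mixed = cosine_similarity(puzzle_game, banking_app)
print(sim_games > sim_mixed)  # the two games are closer in topic space
```

Because topic vectors are non-negative, the score falls between 0 and 1, with higher values indicating apps whose descriptions load on the same topics.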
13. Topic models of app markets
• Input: 195,956 mobile apps in Korean market!
• Output: 100 topics (set of keywords related to a common theme)
* Translated from Korean to English for readers
14. Empirical analysis
Significantly positive:
• Topic_Similarity
• Avg_Rate_Src,Tgt
• File_Size_Tgt
Significantly negative:
• Days_Update_Tgt
Similar results with other metrics
15. Matching in cross promotions
• Problem definition: Given S and T, the ad platform should find a stable matching m that maximizes expected utilities
  • Utility is transferable between s and t
  • Leverage our model to predict utility of a match (s, t)
  • One-to-one matching: one s promotes one t
• Stability:
  • No matched pairs prefer to deviate from the resultant matching
  • [Gale and Shapley 1962], [Roth 1984], [Hatfield et al. 2013]
[Figure: bipartite graph matching sources s_1, s_2, s_3, ..., s_n to targets t_1, t_2, t_3, ..., t_m]
16. Matching mechanism
• Linear Programming from [Gale 1960] and [Shapley and Shubik 1972]
  • Stable matching
  • Price suggestion
• Improved utility:
  • 260% (predicted)
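With transferable utility, stable outcomes of the assignment game coincide with one-to-one assignments that maximize total utility [Shapley and Shubik 1972]. The sketch below illustrates that objective on a toy example. It swaps in brute-force enumeration for the linear program used in the talk, does not compute the suggested prices, and uses a hypothetical function name and made-up utility values.

```python
from itertools import permutations


def best_one_to_one_matching(utility):
    """Return (total, pairs) for the assignment with the highest summed utility.

    utility[s][t] is the predicted utility of source s promoting target t.
    Brute force over all assignments, so only suitable for tiny examples.
    """
    n_sources = len(utility)
    n_targets = len(utility[0])
    if n_sources > n_targets:
        raise ValueError("this sketch assumes at least as many targets as sources")
    best_total, best_pairs = float("-inf"), []
    # Each permutation assigns a distinct target to every source.
    for targets in permutations(range(n_targets), n_sources):
        total = sum(utility[s][t] for s, t in enumerate(targets))
        if total > best_total:
            best_total = total
            best_pairs = list(enumerate(targets))
    return best_total, best_pairs


# Hypothetical predicted utilities for 3 sources (rows) and 3 targets (columns).
predicted = [
    [5.0, 4.0, 1.0],
    [4.0, 1.0, 1.0],
    [1.0, 1.0, 3.0],
]
total, pairs = best_one_to_one_matching(predicted)
print(total, pairs)
```

In this example a greedy choice of the single best pair (source 0 with target 0) leads to a lower total than the optimal assignment, which is why the problem is solved jointly. For realistic market sizes the enumeration is infeasible, and a linear program or a dedicated assignment solver would be needed.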
17. Future Directions
• Extending matching model
  • Many-to-many matching
  • Pricing mechanism
• Empirical analysis
  • Randomized field experiments
19. References
• Mobile app stats: http://en.wikipedia.org/wiki/List_of_mobile_software_distribution_platforms
• Mobile app revenue by developers: http://techcrunch.com/2014/07/21/the-majority-of-todays-app-businesses-are-not-sustainable/
• Facebook Mobile App Ad: https://developers.facebook.com/products/ads/
• Twitter mobile app promotion suite: https://blog.twitter.com/2014/a-new-way-to-promote-mobile-apps-to-1-billion-devices-both-on-and-off-twitter
• IGAWorks: http://www.igaworks.com/en/
• Petsas T., Papadogiannakis A., Polychronakis M., Markatos E. P., and Karagiannis T. (2013). “Rise of the Planet of the Apps: A Systematic Study of the Mobile App Ecosystem.” Proceedings of the Internet Measurement Conference, pp. 277-290.
• Zhong N. and Michahelles F. (2013). “Google Play Is Not A Long Tail Market: An Empirical Analysis of App Adoption on the Google Play App Market.” Proceedings of the Annual ACM Symposium on Applied Computing, pp. 499-504.
• Blei D. M., Ng A. Y., and Jordan M. I. (2003). “Latent Dirichlet Allocation.” Journal of Machine Learning Research, 3: pp. 993-1022.
• Shapley L. and Shubik M. (1972). “The Assignment Game I: The Core.” International Journal of Game Theory, 1: pp. 111-130.
• Roth A. E. (1984). “The Evolution of the Labor Market for Medical Interns and Residents: A Case Study in Game Theory.” Journal of Political Economy, 92(6): pp. 991-1016.
• Hatfield J. W., Kominers S. D., Nichifor A., Ostrovsky M., and Westkamp A. (2013). “Stability and Competitive Equilibrium in Trading Networks.” Journal of Political Economy, 121(5): pp. 966-1005.
• Gale D. (1960). “The Theory of Linear Economic Models.” New York: McGraw-Hill.
21. Winner-takes-all market
http://techcrunch.com/2014/07/21/the-majority-of-todays-app-businesses-are-not-sustainable/