This document presents a two-stage approach to predicting discussion activity on online community forums. In the first stage, a classification model uses user, content, and topic focus features to identify "seed posts", i.e. thread starters that are likely to receive a reply. The second stage uses regression models to predict the level of discussion that each seed post generates. Key findings are that posts with many URLs tend to attract less activity, while lower forum entropy and more complex language are associated with more activity. The approach identified seed posts with an F-measure of 0.792 and reached an nDCG@1 of 0.89 when predicting discussion levels. Future work aims to apply the approach to other social media platforms.
IEEE International Conference on Social Computing, Boston, USA
(http://www.iisocialcom.org/conference/socialcom2011/)
In this event, the OU team presented their work for anticipating
discussion activity on community forums. This work tried to address
two main research questions: which features are key to stimulating
discussions? And, how do these features influence discussion length?
This analysis offers policy makers the opportunity to focus on posts
that are bound to generate more attention from the public.
Anticipating Discussion Activity on Community Forums
1. Anticipating Discussion Activity on Community Forums
Matthew Rowe, Sofia Angeletou and Harith Alani
Knowledge Media Institute, The Open University, Milton Keynes, United Kingdom
The Third IEEE International Conference on Social Computing, MIT, Boston, USA, 2011
2. Community Content
Online communities are now used to:
- Ask questions
- Post opinions and ideas
- Discuss events and current issues
Content analysis in online communities is attractive for:
- Market analysis
- Brand consensus and product opinion
Social network analytics in the US is predicted to reach $1 billion by 2014 (Forrester 2009).
Masses of data are now being published in online communities: Facebook has more than 60 million status updates per day (Facebook statistics 2010).
4. The Need for Analysis
- Analysts need to know which piece of content will generate the most activity, i.e. the most auspicious or influential
- Helps focus the attention of human and computerised analysts: what to track?
- Need to understand the effect that features (community and content) have on attention to content
- Enable content creators to shape their content in order to maximise impact, e.g. promoters, government policy makers
RQ1: Which features are key to stimulating discussions?
RQ2: How do these features influence discussion length?
5. Outline
- Anticipating Discussion Activity: Approach Overview
- Identifying Seed Posts
- Predicting Discussion Activity
- Features
- Dataset: Community Message Board: Boards.ie
- 1. Identifying Seed Posts
- 2. Predicting Discussion Activity
- Findings
- Conclusions
6. Approach Overview
Two-stage approach to predict discussion activity in online communities:
1. Identify seed posts, i.e. thread starters that yield a reply
- Will a given post start a discussion?
- What are the properties that seed posts exhibit?
- What parameters tend to trigger a discussion?
2. Predict discussion activity levels from the identified seed posts
- What is the level of discussion that a seed post will generate?
- What features correlate with heightened discussion activity?
7. Features
For each post, model: a) the author, b) the content and c) the topical concentration of the author.
F1: User Features
- In-degree, out-degree: social network properties of the author
- Post count, age, post rate: participation information of the author
F2: Content Features
- Post length, referral count, time in day: surface features of the post
- Complexity: cumulative entropy of terms in the post
- Readability: Gunning Fog index of the post
- Informativeness: TF-IDF measure of terms within the post
- Polarity: average sentiment of terms in the post
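The complexity feature above is described only as the "cumulative entropy of terms in the post". One plausible reading, offered here as an assumption rather than the authors' exact formula, is the Shannon entropy of the post's term-frequency distribution. The function name `term_entropy` and the whitespace tokenisation are illustrative choices, not taken from the slides.

```python
import math
from collections import Counter


def term_entropy(text: str) -> float:
    """Shannon entropy (in bits) of the term-frequency distribution of a post.

    Assumption: terms are lower-cased, whitespace-separated tokens.
    A post that repeats one word scores 0; a post of all-distinct
    words scores log2(number of words).
    """
    terms = text.lower().split()
    if not terms:
        return 0.0
    counts = Counter(terms)
    total = len(terms)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Under this reading, a wider and more evenly used vocabulary gives a higher score, which matches the slides' gloss of complex language as "wide vocab".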
8. Features (2)
F3: Focus Features
- Topic entropy: the concentration of the author across community forums
  - Higher entropy indicates a wider spread of forum activity
  - More random distribution, less concentrated
- Topic likelihood: the likelihood that a user posts in a specific forum given his post history
  - Measures the affinity that a user has with a given forum
  - Lower likelihood indicates a user posting on an unfamiliar topic
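The two focus features can be sketched from a user's posting history. This is a minimal illustration that assumes the history is a plain list of forum names and that likelihood is the unsmoothed fraction of past posts in the forum; the slides do not say whether the authors applied smoothing. The names `forum_entropy` and `forum_likelihood` are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def forum_entropy(history: Sequence[str]) -> float:
    """Entropy (bits) of a user's posts across forums; higher means a wider spread."""
    if not history:
        return 0.0
    total = len(history)
    counts = Counter(history)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def forum_likelihood(history: Sequence[str], forum: str) -> float:
    """Fraction of the user's past posts made in `forum` (assumed unsmoothed)."""
    if not history:
        return 0.0
    return Counter(history)[forum] / len(history)


# Example: a user who posts mostly in one forum
history = ["rugby", "rugby", "rugby", "world-of-warcraft"]
rugby_affinity = forum_likelihood(history, "rugby")           # 0.75
unfamiliar = forum_likelihood(history, "japanese-culture")    # 0.0
```

A user with low entropy and high likelihood for the target forum is, in the slides' terms, concentrated and posting on familiar ground.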
9. Dataset: Boards.ie
- Irish community message board that was established in 1998
- Covers a wide array of topics and themes in forums, e.g. World of Warcraft, Japanese Culture, Rugby
- We were provided with the complete dataset spanning 1998-2008 of all posts and forum information
- Focussed on 2006 due to the scale of the entire dataset
- No explicit social connections exist in the dataset; social network features were built from the reply-to graph
- A 6-month window prior to the post date was used to build the user and focus features
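As an illustration of deriving social network features from a reply-to graph within a window before the post date, the sketch below counts distinct repliers (in-degree) and distinct users replied to (out-degree). Several details are assumptions not stated in the slides: replies are given as hypothetical `(replier, original_author, date)` tuples, "6 months" is approximated as 182 days, and degrees count distinct users rather than individual replies.

```python
from datetime import date, timedelta
from typing import Iterable, Tuple

Reply = Tuple[str, str, date]  # (replier, original_author, reply_date)


def reply_degrees(replies: Iterable[Reply], user: str, post_date: date,
                  window_days: int = 182) -> Tuple[int, int]:
    """Return (in_degree, out_degree) for `user` from replies made in the
    window [post_date - window_days, post_date)."""
    start = post_date - timedelta(days=window_days)
    repliers_to_user = set()   # users who replied to `user`
    replied_by_user = set()    # users that `user` replied to
    for replier, author, when in replies:
        if not (start <= when < post_date):
            continue  # outside the feature window
        if author == user and replier != user:
            repliers_to_user.add(replier)
        if replier == user and author != user:
            replied_by_user.add(author)
    return len(repliers_to_user), len(replied_by_user)
```

Restricting the window to dates strictly before the post keeps the features free of information from the discussion being predicted.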
10. 1. Identifying Seed Posts
Will a given post start a discussion? What are the properties that seed posts exhibit?
Experiment Setup:
- Used all thread starter posts from Boards.ie in 2006
- Training/validation/testing sets using a 70/20/10% random split
- Binary classification task: is this a seed post or not?
- Measures: precision, recall, f-measure, area under ROC curve
Performed 2 experiments:
a) Model Selection: tested individual feature sets (user, content, focus) and combinations
b) Feature Assessment: dropping 1 feature at a time, record reduction in f-measure
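The experiment setup above can be sketched in a few lines: a seeded 70/20/10 random split and the precision, recall and F-measure for the binary seed/non-seed task. This is an illustrative sketch only; the function names, the fixed seed and the convention that label 1 means "seed post" are assumptions, and the area under the ROC curve is omitted for brevity.

```python
import random
from typing import List, Sequence, Tuple


def split_70_20_10(items: Sequence, seed: int = 0) -> Tuple[List, List, List]:
    """Shuffle with a fixed seed and split into train/validation/test (70/20/10%)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 7 // 10
    n_val = n * 2 // 10
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def precision_recall_f1(y_true: Sequence[int],
                        y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the positive class (1 = seed post)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Integer arithmetic is used for the split sizes so that they do not depend on floating-point rounding; any remainder falls into the test split.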
14. 2. Predicting Discussion Activity
What is the level of discussion that a seed post will generate? What features correlate with heightened discussion activity?
Experiment Setup:
- Train: seed posts in 70% training split
- Test: seed posts in 20% validation split
- Measure: Normalised Discounted Cumulative Gain (nDCG)
- Look at varying rank positions: nDCG@k, k=1,2,5,10,20,50,100
Performed 2 experiments:
a) Model Selection: regression models (Linear, Isotonic, Support Vector Regression); tested individual feature sets (user, content, focus) and combinations
b) Feature Contributions: assess the features in the best performing model from a)
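For reference, nDCG@k can be computed as below. This sketch uses the common formulation DCG@k = sum of rel_i / log2(i + 1) over the top k positions, with the actual discussion length as the relevance of each post; the slides do not state which gain and discount the authors used, so treat both as assumptions. Posts are ranked by the regression model's predicted score.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k items of a ranked list."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(predicted: Sequence[float], actual: Sequence[float], k: int) -> float:
    """nDCG@k: rank posts by predicted score, score the ranking by actual
    discussion length, and normalise by the ideal ranking."""
    order = sorted(range(len(actual)), key=lambda i: predicted[i], reverse=True)
    ranked = [actual[i] for i in order]
    ideal = sorted(actual, reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(ranked, k) / idcg if idcg > 0 else 0.0
```

A value of 1.0 means the predicted ordering of the top k posts matches the ordering by actual discussion length; nDCG@1 only checks whether the post predicted to be most active really is.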
16. 2.a) Model Selection
[Figure: panels for Support Vector Regression, Isotonic and Linear regression models]
17. 2.b) Feature Contributions
What features correlate with heightened discussion activity?
18.
19. Negative sentiment posts generate more activity
20. Conclusions and Future Work
The two-stage approach is able to:
- Identify seed posts to a high degree of accuracy (F-measure: 0.792)
- Predict discussion activity levels (nDCG@1: 0.89, linear regression model)
- Content and focus features yield the best performing model (average nDCG@k: 0.756)
Findings inform:
- Market analysts, to track high activity posts from the outset
- Content creators, to shape content in order to maximise impact
Currently applying the approach over different platforms:
- How can we predict activity on a given social web system?
- How do social web systems differ in generating activity?
21. Questions?
Web: http://people.kmi.open.ac.uk/rowe
Email: m.c.rowe@open.ac.uk
Twitter: @mattroweshow
Editor's Notes
The class distribution is skewed roughly 80% to 20% towards seeds over non-seeds.
- Content features outperform user features
- Content and focus outperform other feature combinations
- All features together work best
- Differs from the Twitter analysis, where user features were better predictors than content features
- Trained J48 with all features using the training split
- Tested it on the held-out 10%
- Dropped 1 feature at a time from the model and classified the test split
- Looking for the features that cause the greatest reduction in accuracy
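The drop-one-feature procedure described in these notes can be sketched generically. The `evaluate` callable is hypothetical: it stands in for "train the classifier on the given features and return its score on the test split", and is not an API from Weka's J48 or any other library.

```python
from typing import Callable, Dict, Sequence


def feature_drop_impact(features: Sequence[str],
                        evaluate: Callable[[Sequence[str]], float]) -> Dict[str, float]:
    """For each feature, report how much the score falls when it is removed.

    `evaluate` is a caller-supplied (hypothetical) function that trains and
    scores a model using only the listed features. Results are ordered from
    the largest reduction to the smallest.
    """
    baseline = evaluate(features)
    impact = {}
    for feature in features:
        reduced = [f for f in features if f != feature]
        impact[feature] = baseline - evaluate(reduced)
    return dict(sorted(impact.items(), key=lambda kv: kv[1], reverse=True))
```

Features at the top of the returned mapping are the ones the model relies on most; a negative value would indicate that removing the feature improved the score.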
Boxplots show:
- Higher referral counts correlate with non-seeds (spam)
- Higher forum likelihood correlates with seeds: users who concentrate their discussions within select forums will start a discussion, as they're known to the community
- Higher informativeness correlates with non-seeds
Solitary features:
- User features perform best as the solitary feature set for Linear regression and SVR
- Focus features perform best for Isotonic regression
Combined:
- Content and focus perform best for Linear and Isotonic regression
Content and focus features show the smallest standard deviation (SD).
A user can expect increased discussion activity if he/she:
- Has low forum entropy
- Has high forum likelihood
- Is negative in his/her posts
- Uses complex language (wide vocab, i.e. articulate)