This document outlines steps for analyzing social media text data using various tools like Crimson Hexagon, Python, and ConText:
1. Collect 10,000 tweets from @realDonaldTrump using Crimson Hexagon and export them as a CSV file.
2. Clean the tweet text by removing stopwords and punctuation, converting it to lowercase, and stemming.
3. Perform analyses such as term frequency, topic modeling, and sentiment analysis, and generate visualizations such as word clouds.
4. The goal is to tell a story by extracting insights from Donald Trump's tweets.
1. SMAC LAB, LSU
OCT 19, 2018
SMAC Talks
Telling Stories from Social Media Text 1
Instructor: Dr. Ke (Jenny) Jiang
2. Telling Stories from Social Media Text 1
Collect Text Data: Python, TCAT, Crimson Hexagon…
Clean Text:
Remove Stop Words, Stemming
Replacing “/”, “@” and “|” with space
Convert the text to lower case
Remove punctuation
Text Analysis: Frequency Analysis, Sentiment Analysis, Entity Detection, Topic Modeling
Visualization: Word Clouds, Semantic Network Analysis (R, Gephi)
Tell a Story
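The cleaning steps and the frequency analysis listed above can be sketched in Python roughly as follows. This is a minimal illustration, not the workflow shown in the talk: the stop-word list is a tiny hypothetical sample, and `crude_stem` is a simplified stand-in for a real stemmer such as Porter's.

```python
import re
import string
from collections import Counter

# Tiny illustrative stop-word list (an assumption); real projects use a fuller list.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "to", "of", "in", "on", "for"}


def crude_stem(word):
    """Strip a few common suffixes. A simplified stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean_text(text):
    """Return the cleaned, stemmed tokens of one post."""
    text = re.sub(r"[/@|]", " ", text)  # replace "/", "@" and "|" with a space
    text = text.lower()                 # convert the text to lower case
    # Remove ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = [t for t in text.split() if t not in STOP_WORDS]  # remove stop words
    return [crude_stem(t) for t in tokens]                     # stemming


def term_frequencies(texts):
    """Count cleaned tokens across a collection of posts."""
    counts = Counter()
    for text in texts:
        counts.update(clean_text(text))
    return counts
```

Calling `term_frequencies(posts).most_common(20)` on a list of post texts would give the kind of ranked term list that a word cloud is usually built from.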
3. Step 1: Collect Text Data Using Crimson Hexagon
Log in to Crimson Hexagon
4. Step 1: Collect Text Data Using Crimson Hexagon
Go to ForSight
5. Step 1: Collect Text Data Using Crimson Hexagon
Click @realDonaldTrump
6. Step 1: Collect Text Data Using Crimson Hexagon
Manage — Bulk Export
7. Step 1: Collect Text Data Using Crimson Hexagon
Export 10,000 posts
8. Step 1: Collect Text Data Using Crimson Hexagon
Open exported file — Save as trump.csv
9. Step 1: Collect Text Data Using Crimson Hexagon
Open trump.csv — Delete the first line
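Deleting the first line by hand can also be done while reading the file. The sketch below assumes that the first line of the export is a non-tabular title line and that the column header row follows it. The function name `load_export` is hypothetical.

```python
import csv


def load_export(path, skip_lines=1):
    """Read a CSV export, skipping leading lines that come before the header row."""
    with open(path, newline="", encoding="utf-8") as f:
        for _ in range(skip_lines):
            next(f, None)  # discard one leading line; do nothing if the file is shorter
        # The next line is treated as the header row (an assumption about the export).
        return list(csv.DictReader(f))
```

For example, `rows = load_export("trump.csv")` would return one dictionary per post, keyed by whatever column names the export actually contains.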
10. Step 2: Create/Load a text file
a. Set Working Directory — Directory for trump.csv
11. Step 2: Create/Load a Text File
b. Get a Sample
Output
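Setting the working directory and drawing a sample could look like this in Python. This is a sketch only and may differ from the tool shown on the slides; the helper names `set_working_directory` and `sample_rows` are hypothetical.

```python
import os
import random


def set_working_directory(path):
    """Change to the folder that holds trump.csv and return the new working directory."""
    os.chdir(path)
    return os.getcwd()


def sample_rows(rows, n=5, seed=None):
    """Return up to n randomly chosen rows; pass a seed to make the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))
```

A fixed `seed` makes the same sample come back on every run, which helps when the sample is inspected by hand before the full 10,000 posts are cleaned.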