Enterprise Resource Planning assignment, semester 7 (2017/2018), STMIK Nusa Mandiri Jakarta. UML design for a car rental application: Use Case Diagram, Activity Diagram, Sequence Diagram, and Class Diagram. Download the slides to see them more clearly. The UML diagrams were created with Enterprise Architect.
Querying your database in natural language was a presentation given at PyData Silicon Valley 2014, based on the quepy software project. More information at:
http://pydata.org/sv2014/abstracts/#197
https://github.com/machinalis/quepy
Querying your database in natural language by Daniel Moisset PyData SV 2014 (PyData)
Most end users can't write a database query, yet they often need to access information that keyword-based searches can't retrieve precisely. Lately, there has been an explosion of proprietary Natural Language Interfaces to knowledge databases, like Siri, Google Now and Wolfram Alpha. On the open side, huge knowledge bases like DBpedia and Freebase exist, but access to them is typically limited to formal database query languages. We implemented Quepy as an approach to solving this problem. Quepy is an open source framework that transforms Natural Language questions into semantic database queries that can be used with popular knowledge databases such as DBpedia and Freebase. So instead of requiring end users to learn a query language, a Quepy application can fill the gap, allowing end users to make their queries in "plain English". In this talk we will discuss the techniques used in Quepy, what additional work can be done, and its limitations.
The Naive Bayes classifier is a machine learning technique that is useful for a wide range of classification problems. It is often used as a baseline classifier to benchmark results. It is also used as a standalone classifier for tasks such as spam filtering, where the naive assumption (conditional independence) made by the classifier seems reasonable. In this presentation we discuss the mathematical basis for Naive Bayes and illustrate it with examples.
You have heard about it and now you want to learn a bit more, but you don't have a PhD in computer science. No problem! In this presentation, you will learn the basics of machine learning through different examples. You will get to know Bayes classifiers, sentiment analysis, and genetic algorithms. By the end of this presentation, you will know a bit more about machine learning and how to implement it in your own projects.
Reflected Intelligence: Lucene/Solr as a self-learning data system (Trey Grainger)
What if your search engine could automatically tune its own domain-specific relevancy model? What if it could learn the important phrases and topics within your domain, automatically identify alternate spellings (synonyms, acronyms, and related phrases) and disambiguate multiple meanings of those phrases, learn the conceptual relationships embedded within your documents, and even use machine-learned ranking to discover the relative importance of different features and then automatically optimize its own ranking algorithms for your domain?
In this presentation, you’ll learn how to do just that: to evolve Lucene/Solr implementations into self-learning data systems which are able to accept user queries, deliver relevance-ranked results, and automatically learn from your users’ subsequent interactions to continually deliver a more relevant experience for each keyword, category, and group of users.
Such a self-learning system leverages reflected intelligence to consistently improve its understanding of the content (documents and queries), the context of specific users, and the relevance signals present in the collective feedback from every prior user interaction with the system. Come learn how to move beyond manual relevancy tuning and toward a closed-loop system leveraging both the embedded meaning within your content and the wisdom of the crowds to automatically generate search relevancy algorithms optimized for your domain.
Monthly AI Tech Talks in Toronto 2019-08-28
https://www.meetup.com/aittg-toronto
The talk will cover the end-to-end details, including contextual and linguistic feature extraction, vectorization, n-grams, topic modeling, and named entity resolution, all of which are based on concepts from mathematics, information retrieval and natural language processing. We will also dive into more advanced feature engineering strategies such as word2vec, GloVe and fastText that leverage deep learning models.
In addition, attendees will learn how to combine NLP features with numeric and categorical features and analyze the feature importance from the resulting models.
The following libraries will be used to demonstrate the aforementioned feature engineering techniques: spaCy, Gensim, fastText and Keras in Python.
https://www.meetup.com/aittg-toronto/events/261940480/
A lot of people talk about Data Mining, Machine Learning and Big Data. It clearly must be important, right?
A lot of people are also trying to sell you snake oil: sometimes half-arsed and overpriced products or solutions promising a world of insight into your customers or users if you hand over your data to them. Instead, trying to understand your own data and what you could do with it should be the first thing you look at.
In this talk, we’ll introduce some basic terminology about Data and Text Mining as well as Machine Learning, and will have a look at what you can do on your own to understand more about your data and discover patterns in it.
Semantics2018 Zhang, Petrak, Maynard: Adapted TextRank for Term Extraction: A G... (Johann Petrak)
Slides for the talk about the paper:
Ziqi Zhang, Johann Petrak and Diana Maynard, 2018: Adapted TextRank for Term Extraction: A Generic Method of Improving Automatic Term Extraction Algorithms. Semantics-2018, Vienna, Austria
NLP - Tag prediction on Stack Overflow questions (FUMERY Michael)
A presentation of NLP techniques, bag of words, TF-IDF, and supervised and unsupervised modeling for automatic tag prediction on Stack Overflow questions.
There are a number of players that provide full text search, ranging from embedded search to dedicated search servers [solr, sphinx, elasticsearch etc], but setting up and configuring them is a time-consuming process and requires considerable knowledge of the tools.
What if we could get comparable search results using the full text search capabilities of Postgres? Developers already have a working knowledge of the database, so this should come naturally. In addition, it is one less tool to manage.
Code: https://github.com/Syerram/postgres_search
Similar to Penerapan text mining menggunakan python (20)
Introduction to Machine Learning, Supervised Learning, Unsupervised Learning, Algorithms, K-Means, K-Nearest Neighbor, Support Vector Machine, DBSCAN, and use cases for industry sectors such as finance and e-commerce.
R is a language and environment for statistical computing and graphics. R is free, and these slides are for beginners. They start from the basics: variables, data structures, reading data, charts, functions, conditional statements, iteration, grouping, reshaping, and string operations.
Social media analysis is important when you want to know who is at the center of a network. These slides help you analyze social media connections using NetworkX.
The Building Blocks of QuestDB, a Time Series Database (javier ramirez)
Talk Delivered at Valencia Codes Meetup 2024-06.
Traditionally, databases have treated timestamps as just another data type. However, when performing real-time analytics, timestamps should be first-class citizens, and we need rich time semantics to get the most out of our data. We also need to deal with ever-growing datasets while staying performant, which is as fun as it sounds.
It is no wonder time-series databases are now more popular than ever before. Join me in this session to learn about the internal architecture and building blocks of QuestDB, an open source time-series database designed for speed. We will also review a history of some of the changes we have gone through over the past two years to deal with late and unordered data, non-blocking writes, read-replicas, and faster batch ingestion.
Analysis insight about a Flyball dog competition team's performance (roli9797)
Insights from my analysis of a Flyball dog competition team's performance over the last year. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working with unstructured data. Speakers will present on related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs. This meetup was formerly Milvus Meetup, and is sponsored by Zilliz, maintainers of Milvus.
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake (Walaa Eldin Moustafa)
Dynamic policy enforcement is becoming an increasingly important topic in today’s world, where data privacy and compliance are a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences. (3) They are context-aware, encoding a different set of transformations for different use cases. (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
State of Artificial Intelligence Report 2023 (kuntobimo2016)
Artificial intelligence (AI) is a multidisciplinary field of science and engineering whose goal is to create intelligent machines.
We believe that AI will be a force multiplier on technological progress in our increasingly digital, data-driven world. This is because everything around us today, ranging from culture to consumer products, is a product of intelligence.
The State of AI Report is now in its sixth year. Consider this report as a compilation of the most interesting things we’ve seen with a goal of triggering an informed conversation about the state of AI and its implication for the future.
We consider the following key dimensions in our report:
Research: Technology breakthroughs and their capabilities.
Industry: Areas of commercial application for AI and its business impact.
Politics: Regulation of AI, its economic implications and the evolving geopolitics of AI.
Safety: Identifying and mitigating catastrophic risks that highly-capable future AI systems could pose to us.
Predictions: What we believe will happen in the next 12 months and a 2022 performance review to keep us honest.
2. Content
● Introduction to Text Mining
● Working with Regular Expressions
● Basics of Natural Language Processing
● Text Classification
● Topic Modeling
5. Things you can do with text
● Parse text
● Extract information from text
● Classify text documents
● Search for relevant text documents
● Sentiment analysis
11. Regular Expression
. : matches any single character (except a newline, by default)
^ : start of the string
$ : end of the string
[] : matches one character from the set inside the brackets
[a-z] : matches one character in the range a, b, c, d, …, z
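The patterns above can be tried out with Python's standard `re` module. This is a minimal sketch; the sample strings and variable names are made up for illustration and do not come from the slides.

```python
import re

text = "cat bat rat"

# "." matches any single character except a newline (by default)
any_char = re.findall(r".at", text)

# "^" anchors the match at the start of the string
starts_with_cat = bool(re.search(r"^cat", text))

# "$" anchors the match at the end of the string
ends_with_rat = bool(re.search(r"rat$", text))

# "[]" matches exactly one character from the listed set
set_match = re.findall(r"[cb]at", text)

# "[a-z]" matches exactly one character in the range a..z
lowercase_letters = re.findall(r"[a-z]", "Py3")

print(any_char, starts_with_cat, ends_with_rat, set_match, lowercase_letters)
```

Note that `[a-z]` is case-sensitive, so the uppercase `P` and the digit `3` in `"Py3"` are not matched.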
13. Definition
● The ability to understand human language
● Understanding human language in order to obtain information about words and about how human language is structured
14. NLP Goals
● Counting words
● Finding sentence boundaries
● POS tagging
● Parsing sentence structure (S + P + O + K: subject, predicate, object, adverbial)
● Identifying semantic roles
● Identifying entities in a sentence
● Finding which entity a possessive word (pronoun) refers to
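The first two goals, counting words and finding sentence boundaries, can be approximated with the standard library alone. The sketch below is a naive regex-based stand-in, not how a dedicated NLP toolkit does it: it assumes sentences end with `.`, `?` or `!` followed by whitespace, and it will mis-split abbreviations such as "Dr. Smith". The sample text is invented for illustration.

```python
import re
from collections import Counter

text = "NLP is fun. Is NLP hard? NLP takes practice!"

# Naive sentence boundary detection: split on whitespace that follows ., ? or !
sentences = re.split(r"(?<=[.?!])\s+", text)

# Naive word tokenization: lowercase the text, then keep runs of letters
words = re.findall(r"[a-z]+", text.lower())

# Count how often each word occurs
word_counts = Counter(words)

print(sentences)
print(word_counts.most_common(2))
```

The remaining goals (POS tagging, parsing, semantic roles, entities, reference resolution) need linguistic models and are not covered by this sketch.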
15. NLTK
● Toolkit
● Open source
● Provides a wrapper for scikit-learn classifiers (SklearnClassifier)
● Includes several popular corpora
23. Which category fits the text below?
http://ekonomi.kompas.com/read/2017/10/25/102555326/apbn-2018-diharapkan-bisa-menjadi-sentimen-positif
● Development
● Finance
● Politics
24. Uses of Text Classification
● Sentiment analysis: is this movie review negative or positive?
● Spam detection: is this email spam or not?
● Topic identification: is this news item about technology, sports, or health?
● Spelling correction: "bener" or "benar"?
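As one possible illustration of the spam detection use case, here is a small multinomial Naive Bayes classifier written from scratch. The choice of Naive Bayes is an assumption made for this sketch; the slides do not say which algorithm they use. The training messages, the labels `"spam"` and `"ham"`, and the function names are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def train(examples):
    """Collect per-label document and word counts from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary

def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior: fraction of training documents carrying this label
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero the score
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training set, invented for illustration only
training_data = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch with the project team", "ham"),
]
model = train(training_data)

print(predict(model, "free prize money"))
print(predict(model, "project meeting on monday"))
```

Four training messages are far too few for a real filter; the point is only to show how word counts per label turn into a classification decision.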
26. Fundamentals of classification
● Binary Classification
● Multi-class Classification
● Multi-label Classification
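The three settings differ mainly in the shape of the answer: a yes/no decision, exactly one label out of several, or any number of labels at once. The sketch below shows only that difference, using a hypothetical keyword lookup in place of a trained model; the topic names and keyword sets are assumptions made for illustration.

```python
# Hypothetical keyword lists, chosen only to illustrate the three output shapes
TOPIC_KEYWORDS = {
    "technology": {"software", "phone"},
    "sports": {"match", "goal"},
    "health": {"doctor", "vaccine"},
}

def topic_scores(text):
    """Count how many keywords of each topic appear in the text."""
    words = set(text.lower().split())
    return {topic: len(words & keywords) for topic, keywords in TOPIC_KEYWORDS.items()}

def binary_is_sports(text):
    """Binary classification: a single yes/no decision."""
    return topic_scores(text)["sports"] > 0

def multi_class_topic(text):
    """Multi-class classification: exactly one label out of several."""
    scores = topic_scores(text)
    return max(scores, key=scores.get)

def multi_label_topics(text):
    """Multi-label classification: zero or more labels at the same time."""
    return {topic for topic, score in topic_scores(text).items() if score > 0}

headline = "doctor checks players after the match and the goal"
print(binary_is_sports(headline))
print(multi_class_topic(headline))
print(multi_label_topics(headline))
```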
27. Text Features
● Words
○ Most common words
○ Stop words
○ Normalization
○ Stemming / Lemmatization
● Sentences
○ POS tagging
○ Grammatical structure
○ Similar words
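The word-level features can be sketched with the standard library. The stop-word list and the suffix-stripping rule below are deliberately tiny, hypothetical stand-ins: a real stop-word list is much longer, and a real stemmer or lemmatizer handles far more cases (this one does not map "mice" to "mouse", for example).

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; real lists are much longer
STOP_WORDS = {"the", "a", "is", "are", "and", "of", "in", "on"}

def normalize(text):
    """Normalization: lowercase the text and keep only alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def remove_stop_words(tokens):
    """Drop very frequent words that carry little meaning on their own."""
    return [t for t in tokens if t not in STOP_WORDS]

def crude_stem(token):
    """Strip a few common English suffixes; far cruder than a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

text = "The cats are chasing the mouse and the cat chased the mice"
tokens = remove_stop_words(normalize(text))
stems = [crude_stem(t) for t in tokens]

# Most common words, counted after stemming
most_common = Counter(stems).most_common(2)

print(tokens)
print(stems)
print(most_common)
```

Stemming lets "cats" and "cat", or "chasing" and "chased", count as the same feature, which is why it usually comes before counting.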
38. Uses of Semantic Similarity
● Semantic similarity is the practical, widely used approach to address
the natural language understanding issue in many core NLP tasks such
as paraphrase identification, Question Answering, Natural Language
Generation, and Intelligent Tutoring Systems
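As a rough starting point, similarity between two texts can be measured as the cosine of the angle between their bag-of-words count vectors. This is a swapped-in lexical measure, not true semantic similarity: it only rewards shared words. The function names and sample sentences are assumptions made for this sketch.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Map a text to its word counts (lowercased, alphabetic tokens only)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

same = cosine_similarity(bag_of_words("the cat sat"), bag_of_words("the cat sat"))
overlap = cosine_similarity(bag_of_words("the cat sat"), bag_of_words("the cat ran"))
paraphrase = cosine_similarity(bag_of_words("how old are you"),
                               bag_of_words("what is your age"))

print(same, overlap, paraphrase)
```

The last pair is a paraphrase, yet it scores 0.0 because the two questions share no words. Closing that gap is what semantic approaches such as lexical resources or word embeddings are for.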
42. Topic Modeling
● Discovering hidden topical patterns that are present across the
collection
● Annotating documents according to these topics
● Using these annotations to organize, search and summarize texts