This presentation traces the progression from the advent of data mining, through data warehousing, to pattern warehousing. It identifies the present gaps and suggests directions for future work and research.
An emerging step: Data Warehousing to Pattern Warehousing
1. MADHAV INSTITUTE OF TECHNOLOGY & SCIENCE,
GWALIOR(M.P.)
DEPARTMENT OF CSE/IT
A SYNOPSIS REPORT ON
MODEL FOR OPTIMAL PATTERN EXTRACTION FROM DIABETES MELLITUS PATTERN WAREHOUSE (DMPW)
USING PSO
BY :
HARSHITA S. JAIN
3. INTRODUCTION (ARRIVAL OF DATA MINING)
• IN THE 1990S, THE TERM “DATA MINING” APPEARED IN THE DATABASE COMMUNITY.
• RETAIL COMPANIES AND THE FINANCIAL COMMUNITY USE DATA MINING
TO ANALYZE DATA AND RECOGNIZE TRENDS IN ORDER TO INCREASE THEIR CUSTOMER
BASE AND TO PREDICT FLUCTUATIONS IN INTEREST RATES, STOCK PRICES,
CUSTOMER DEMAND, ETC.
• THE APPLICATION DOMAIN OF DATA MINING HAS BEEN EXPANDING EVER SINCE.
• DATA MINING IS THE PROCESS OF EXTRACTING INFORMATION FROM LARGE
AMOUNTS OF DATA STORED IN HUGE REPOSITORIES.
4. INTRODUCTION
• TODAY’S WORLD PRODUCES AN ENORMOUS AMOUNT OF DATA ON A REGULAR BASIS
FROM VARIOUS SOURCES. DATA IN SUCH HUGE VOLUMES DO NOT CONSTITUTE
KNOWLEDGE, I.E., THEY CANNOT BE DIRECTLY EXPLOITED BY HUMAN BEINGS AND
NO USEFUL INFORMATION CAN BE DEDUCED SIMPLY BY OBSERVING THEM. THUS,
MORE ELABORATE TECHNIQUES ARE REQUIRED IN ORDER TO EXTRACT THE HIDDEN
KNOWLEDGE AND MAKE THESE DATA VALUABLE TO THE END-USERS [4].
• DATA MINING WAS DEVELOPED TO HELP EXTRACT KNOWLEDGE FROM THE RAW
DATA, USING ALGORITHMS THAT CAN DISCOVER STATISTICAL PROPERTIES
OF THE ORIGINAL DATA. DATA MINING PRODUCES RESULTS SUCH AS ASSOCIATION RULES,
CLUSTERS, DECISION TREES AND OTHER STRUCTURES THAT DESCRIBE
PROPERTIES OF THE RAW DATA.
• THE COMMON CHARACTERISTIC OF ALL THESE TECHNIQUES IS THAT BIG PORTIONS
OF THE AVAILABLE DATA ARE ABSTRACTED AND REPRESENTED BY A SMALL NUMBER
OF KNOWLEDGE-CARRYING REPRESENTATIVES, WHICH WE CALL PATTERNS (TIWARI
& THAKUR, 2012). PATTERNS REPRESENT THE HUGE QUANTITY OF HETEROGENEOUS
DATA IN A COMPACT AND SEMANTICALLY RICH WAY.
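To make the idea of a pattern concrete, the sketch below computes the two usual measures of an association rule, support and confidence, over a handful of transactions. The transactions and item names are hypothetical and are not taken from the slides; the point is only that one small record (rule plus measures) summarizes many raw rows.

```python
# Hypothetical toy transactions (illustrative only, not from the slides).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# The rule "bread -> milk" stored as a compact pattern record.
antecedent, consequent = {"bread"}, {"milk"}
pattern = {
    "rule": "bread -> milk",
    "support": support(antecedent | consequent, transactions),
    "confidence": confidence(antecedent, consequent, transactions),
}
print(pattern)
```

Here four transactions collapse into one record; on a real repository the same record would stand in for millions of rows.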
5. DATA WAREHOUSE
• DATA WAREHOUSES ARE USED TO CONSOLIDATE DATA LOCATED IN
DISPARATE DATABASES. A DATA WAREHOUSE STORES LARGE QUANTITIES OF
DATA BY SPECIFIC CATEGORIES SO IT CAN BE MORE EASILY RETRIEVED,
INTERPRETED, AND SORTED BY USERS.
• WAREHOUSES ENABLE EXECUTIVES AND MANAGERS TO WORK WITH VAST
STORES OF TRANSACTIONAL OR OTHER DATA TO RESPOND FASTER TO
MARKETS AND MAKE MORE INFORMED BUSINESS DECISIONS. IT HAS BEEN
PREDICTED THAT EVERY BUSINESS WILL HAVE A DATA WAREHOUSE WITHIN
TEN YEARS. BUT MERELY STORING DATA IN A DATA WAREHOUSE DOES A
COMPANY LITTLE GOOD.
• COMPANIES WILL WANT TO LEARN MORE ABOUT THAT DATA TO IMPROVE
KNOWLEDGE OF CUSTOMERS AND MARKETS. THE COMPANY BENEFITS WHEN
MEANINGFUL TRENDS AND PATTERNS ARE EXTRACTED FROM THE DATA.
6. ISSUES RELATED TO DATA WAREHOUSE
• A SINGLE DATA WAREHOUSE IS VERY LARGE, SO MANAGING IT BECOMES A
TEDIOUS TASK.
• FOR ANALYSIS PURPOSES, BUSINESS ANALYSTS DEMAND CONSOLIDATED
INFORMATION.
• WITH DATA INCREASING EXPONENTIALLY DAY BY DAY, THE COST OF STORAGE MEANS
THE DATA WAREHOUSE IS NO LONGER THE BEST SOLUTION TO THE PROBLEM.
• DESIRED PATTERNS ARE VOLATILE IN A DATA WAREHOUSE, SO EVEN
FOR A SMALL ANALYSIS THE WHOLE DATA MINING PROCESS HAS TO BE
PERFORMED AGAIN TO OBTAIN RESULTS.
7. ADVENT OF PATTERN WAREHOUSE
• AS THE SIZE OF THE DATA WAREHOUSE GROWS DUE TO THE MASSIVE
INCREASE OF DATA, BUSINESS ANALYSTS NO LONGER NEED HUGE VOLUMES OF
ANALYTICAL DATA; THEY ARE INTERESTED IN GETTING ONLY THE
RELEVANT PATTERNS HIDDEN WITHIN REPOSITORIES.
• AND SO THE CONCEPT OF THE PATTERN WAREHOUSE WAS INTRODUCED [1].
FIG: PROCESS OF KNOWLEDGE DISCOVERY FROM DATABASES
8. PATTERN WAREHOUSE & PATTERN MINING
• A PATTERN WAREHOUSE IS A KIND OF REPOSITORY THAT STORES THE
RELEVANT PATTERNS, WHICH REPRESENT THE
RELATIONSHIPS THAT EXIST BETWEEN THE DATA ELEMENTS.
• PATTERN MINING IS PERFORMED UPON THE PATTERNS STORED IN THE PATTERN
WAREHOUSE TO GENERATE ANALYTICAL OUTCOMES. THROUGH PATTERN
MINING, THE ANALYST HAS TO DEAL WITH ONLY A SMALL AMOUNT OF INFORMATION [7].
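A minimal sketch of this idea follows, assuming patterns are association rules with stored support and confidence. The `Pattern` and `PatternWarehouse` names, their fields, and the sample values are hypothetical and do not describe the system in [7]; the sketch only shows that an analyst can query stored patterns without re-running data mining on the raw data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pattern:
    """One stored pattern: an association rule plus its quality measures."""
    antecedent: frozenset
    consequent: frozenset
    support: float
    confidence: float

class PatternWarehouse:
    """Hypothetical in-memory store of already-mined patterns."""

    def __init__(self):
        self._patterns = []

    def add(self, pattern):
        self._patterns.append(pattern)

    def query(self, min_support=0.0, min_confidence=0.0, item=None):
        """Pattern mining step: filter stored patterns, never the raw data."""
        return [
            p for p in self._patterns
            if p.support >= min_support
            and p.confidence >= min_confidence
            and (item is None or item in p.antecedent | p.consequent)
        ]

# Illustrative values only.
warehouse = PatternWarehouse()
warehouse.add(Pattern(frozenset({"a"}), frozenset({"b"}), 0.40, 0.90))
warehouse.add(Pattern(frozenset({"b"}), frozenset({"c"}), 0.10, 0.95))
warehouse.add(Pattern(frozenset({"a", "c"}), frozenset({"d"}), 0.30, 0.60))

strong = warehouse.query(min_support=0.25, min_confidence=0.8)
about_c = warehouse.query(item="c")
print(len(strong), len(about_c))
```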
9. RECENT APPROACH
• THE RECENT APPROACH CONSISTS OF AN EVOLUTIONARY ALGORITHM
(GENETIC ALGORITHM) WHICH WORKS UPON THE OPTIMIZATION ENGINE AND
GENERATES OPTIMAL PATTERNS FROM THE PATTERN WAREHOUSE [7].
• THE WORKFLOW TO OBTAIN OPTIMAL PATTERNS IS:
PATTERN WAREHOUSE → OPTIMIZATION ENGINE → REPOSITORY FOR OPTIMAL
PATTERNS
12. TAKING A STEP AHEAD OF THE EXISTING APPROACH
LIMITATIONS OF USING A GENETIC ALGORITHM:
• NO GUARANTEE OF REACHING THE GLOBAL OPTIMUM REGARDING FALSE FREQUENT
PATTERNS.
• CANNOT ASSURE A CONSTANT OPTIMIZATION RESPONSE
TIME.
• CANNOT BE USED IN DYNAMIC PROBLEMS.
• DOMAIN OF APPLICABILITY IS LIMITED.
13. PROPOSED METHODOLOGY
• AN ALGORITHM IS PROPOSED WHICH WORKS UPON THE OPTIMIZATION ENGINE
TO GENERATE OPTIMAL PATTERNS FROM THE PATTERN WAREHOUSE. THE
PROPOSED ALGORITHM USES PARTICLE SWARM OPTIMIZATION.
• THE STEPS OF THE ALGORITHM ARE PRESENTED ONE BY ONE, FOLLOWED BY A
FLOWCHART THAT SHOWS THE EXECUTION OF THE WHOLE PROCESS.
14. PARTICLE SWARM OPTIMIZATION
• PARTICLE SWARM OPTIMIZATION (PSO) IS A POPULATION-BASED STOCHASTIC
OPTIMIZATION TECHNIQUE DEVELOPED BY DR. EBERHART AND DR. KENNEDY
IN 1995, INSPIRED BY THE SOCIAL BEHAVIOR OF BIRD FLOCKING OR FISH
SCHOOLING.
• THE SYSTEM IS INITIALIZED WITH A POPULATION OF RANDOM SOLUTIONS AND
SEARCHES FOR OPTIMA BY UPDATING GENERATIONS.
• IT USES A NUMBER OF AGENTS (PARTICLES) THAT CONSTITUTE A SWARM
MOVING AROUND IN THE SEARCH SPACE LOOKING FOR THE BEST SOLUTION.
• EACH PARTICLE IS TREATED AS A POINT IN AN N-DIMENSIONAL SPACE WHICH
ADJUSTS ITS “FLYING” ACCORDING TO ITS OWN FLYING EXPERIENCE AS WELL
AS THE FLYING EXPERIENCE OF OTHER PARTICLES.
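The bullets above can be sketched as a short program. This is a minimal illustration of the commonly used inertia-weight form of PSO, not the algorithm proposed in this synopsis. The objective (a simple sphere function), the parameter values `w`, `c1`, `c2`, and all function names are assumptions chosen for the example.

```python
import random

def pso(objective, dim, bounds, n_particles=30, n_iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Swarm is initialized with random solutions and zero velocity.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle's own best ("own flying experience").
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    # Best found by the whole swarm ("experience of other particles").
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move, then clamp the position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Toy objective with its minimum (0) at the origin."""
    return sum(v * v for v in x)

best_pos, best_val = pso(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_pos, best_val)
```

To apply this to pattern selection, the objective would have to score a candidate set of patterns; how that score is defined is outside what the slides specify.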
16. THEORETICAL ANALYSIS / EXPECTED OUTCOME
BASIS OF CHOICE: WHILE GOING THROUGH VARIOUS RESEARCH PAPERS
COMPARING DIFFERENT NATURE-INSPIRED ALGORITHMS, PSO WAS
FOUND TO BE MORE EFFECTIVE AND VERSATILE.
COMPARISON BETWEEN GENETIC ALGORITHM (GA) AND PARTICLE SWARM OPTIMIZATION (PSO):
• GA WAS DESIGNED BASICALLY FOR DISCRETE OPTIMIZATION, WHERE BITS 0
AND 1 ARE USED TO ENCODE DISCRETE DESIGN VARIABLES, WHEREAS PSO WAS
DESIGNED FOR CONTINUOUS PROBLEMS AND CAN USE ANY VALUE TO
ENCODE DESIGN VARIABLES.
• ALTHOUGH PSO WAS DESIGNED TO SOLVE CONTINUOUS PROBLEMS, IT WAS
LATER MODIFIED FOR DISCRETE OR BINARY OPTIMIZATION PROBLEMS AS WELL.
• GA SOLVES PROBLEMS WHERE THERE IS NO PREDETERMINED SHAPE, SIZE &
COMPLEXITY, WHEREAS IN PSO THE SOURCE AND DESTINATION NEED TO BE
DEFINED UNIQUELY AND CLEARLY.
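The encoding difference can be illustrated with two small helpers. Both are hypothetical sketches: `decode_bits` shows a GA-style bit string decoded into a design variable, and `binary_position` shows one common way binary PSO variants turn a real-valued velocity into bits, by using a sigmoid of the velocity as the probability that a bit is 1. Neither is taken from the slides.

```python
import math
import random

def decode_bits(bits, lo, hi):
    """GA-style encoding: map a list of 0/1 bits to a value in [lo, hi]."""
    as_int = int("".join(str(b) for b in bits), 2)
    return lo + (hi - lo) * as_int / (2 ** len(bits) - 1)

def binary_position(velocity, rng):
    """Binary-PSO-style step: bit d is 1 with probability sigmoid(velocity[d])."""
    return [1 if rng.random() < 1.0 / (1.0 + math.exp(-v)) else 0
            for v in velocity]

# A 4-bit chromosome covers 16 evenly spaced values between lo and hi.
x = decode_bits([1, 0, 0, 0], 0.0, 15.0)

# Strongly positive velocity pushes a bit to 1, strongly negative to 0.
bits = binary_position([100.0, -100.0], random.Random(0))
print(x, bits)
```

In continuous PSO no decoding step is needed, because each particle position is already a vector of real-valued design variables.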
17. FUTURE EXPLORATION MOTIVES
• TO TAKE ON THE ARCHITECTURAL ASPECTS OF THE PATTERN WAREHOUSE
AND TRY TO MAKE PATTERN RETRIEVAL MORE EFFICIENT AND SCALABLE.
18. REFERENCES
1. AGARWAL, V. AND TIWARI, A., “FROM DATA WAREHOUSE TO PATTERN WAREHOUSE: A
PROGRESSIVE STEP”, INTERNATIONAL JOURNAL OF ENGINEERING RESEARCH, VOL. 5,
NO. 4, PP. 249-252, 2016.
2. HAN, J. AND KAMBER, M., “DATA MINING: CONCEPTS AND TECHNIQUES”, SECOND EDITION,
MORGAN KAUFMANN PUBLISHERS, SAN FRANCISCO, ELSEVIER, 2006.
3. TIWARI, A., GUPTA, R. K. AND AGRAWAL, D. P., “A SURVEY ON FREQUENT PATTERN MINING:
CURRENT STATUS AND CHALLENGING ISSUES”, INFORMATION TECHNOLOGY JOURNAL, 9(7):1278-1293,
2010.
4. TERROVITIS, M. AND VASSILIADIS, P., “ARCHITECTURE FOR PATTERN BASE MANAGEMENT
SYSTEMS”, DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING, NATIONAL TECHNICAL
UNIVERSITY OF ATHENS, 2003.
5. TIWARI, V. AND THAKUR, R. S., “P2MS: A PHASE-WISE PATTERN MANAGEMENT SYSTEM FOR
PATTERN WAREHOUSE”, INTERNATIONAL JOURNAL OF DATA MINING, MODELING AND
MANAGEMENT, 2014.
6. DUNHAM, M. H., “DATA MINING: INTRODUCTORY AND ADVANCED TOPICS”, PEARSON
EDUCATION, 2006.