Netica: Netica is a Bayesian network modeling and inference software package developed by Norsys Software Corp. It allows users to build and evaluate causal probabilistic models known as Bayesian networks.
R: R is a programming language and software environment for statistical computing and graphics. It is a GNU project similar to the S language and environment, which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues.
Weka: Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, and association rules.
Today, the use of data is having a revolutionary effect: with cultivable land in decline, demand for food is increasing from developing countries. Farmers who use data are capable of turning ordinary harvests into bumper crops, with profits to follow. Precision agriculture hubs connect the world's biggest agricultural businesses, farmers, and suppliers using integrated software solutions.
This survey covers crop yield prediction using data mining classification methods, including prediction with classification, residue and climate control, feature selection and extraction, crop classification models, evaluation metrics, accuracy levels, classification decisions, result analysis, rainfall and pH, principal component analysis, and information gain.
Internet of things (IoT) smart technology enables new digital agriculture. Technology has become necessary to address today's challenges, and many sectors are automating their processes with the newest technologies. By optimizing fertiliser use to boost plant efficiency, smart agriculture, which is based on IoT technology, intends to assist producers and farmers in reducing waste while improving output. With IoT-based smart farming, farmers can better manage their animals, develop crops, save costs, and conserve resources. Climate monitoring, drought detection, agricultural production, pollution distribution, and many more applications rely on the weather forecast, whose accuracy is determined by prior weather conditions across broad areas and over long periods. Machine learning algorithms can help build a model with proper accuracy. As a result, increasing the output on the limited acreage available is important. IoT smart farming is a high-tech method that allows people to cultivate crops cleanly and sustainably: in agriculture, it is the use of current information and communication technologies.
2. DEPARTMENT OF STUDIES IN COMPUTER SCIENCE
TOPIC: BIG DATA ANALYTICS IN AGRICULTURE
SUBMITTED TO,
MANASA K N
DOS IN COMPUTER SCIENCE
SUBMITTED BY,
TEJASHREE K
YAMUNA
RESHAD
SMITHA
DOS IN COMPUTER SCIENCE
6. BIG DATA:
Big data is an extensive collection of both structured and unstructured data that can be mined for information and analyzed to build predictive systems for better decision making.
Big data is considered a large collection of datasets characterized by high volume, velocity, and variety.
Volume: the amount of data generated.
Velocity: the speed at which data is generated and processed.
Variety: the different types and formats of the data.
7. BIG DATA ANALYTICS IN AGRICULTURE
Big data analysis has been successfully adopted by industries such as banking and insurance. Although agriculture did not adopt big data analysis for many years, it has recently begun to do so.
Big data analytics in agriculture can be studied under two major areas: smart farming and precision agriculture.
Farmers use big data to get information on changing weather, rainfall, fertilizer usage, and other factors that impact crop yield.
All of this information assists farmers in making accurate and dependable decisions that maximize their productivity in cultivating the land.
8. The role of Big Data in the agriculture industry
Big data in the agriculture industry is based on using technology, information, and analytics to bring useful information to farmers. It can be utilized to gather information about the agriculture industry as a whole, or it can benefit a specific segment or area by improving its efficiency. Big data relies on data mining processes to create such vital information: with this methodology, you can find the important patterns in a huge set of data and condense them into useful forms. Modern techniques such as artificial intelligence, machine learning, and statistics are used in the big data mechanism.
9. How is big data analytics transforming agriculture?
Boosting productivity – Data collected from GPS-equipped tractors, soil sensors, and other external sources has helped in better management of seeds, pesticides, and fertilizers while increasing productivity to feed the ever-increasing global population.
Access to plant genome information – This has allowed the development of useful agronomic traits.
Predicting yields – Mathematical models and machine learning are used to collate and analyze data obtained from yield, chemicals, weather, and biomass indices. The use of sensors for data collection reduces erroneous manual work and provides useful insights for yield prediction.
Risk management – Data-driven farming has mitigated crop failures arising from changing weather patterns.
Food safety – Collecting data on temperature, humidity, and chemicals lowers the risk of food spoilage through early detection of microbes and other contaminants.
11. HOW TO USE BIG DATA ANALYTICS IN AGRICULTURE
To counter the pressures of increasing food demand and climate change, policymakers and industry leaders are seeking assistance from technologies such as IoT, big data analytics, and cloud computing.
First, IoT devices help with data collection: sensors plugged into tractors and trucks, as well as into fields, soil, and plants, aid in the collection of real-time data directly from the ground.
Second, analysts integrate the large amount of data collected with other information available in the cloud, such as weather and pricing data, and apply models to determine patterns.
Finally, these patterns and insights assist in controlling the problem: they help pinpoint existing issues, such as operational inefficiencies.
12. The adoption of analytics in agriculture has been increasing consistently; its market size is expected to grow from USD 585 million in 2018 to USD 1,236 million by 2023, at a compound annual growth rate (CAGR) of 16.2%.
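As a quick arithmetic check of the quoted figures (using only the numbers above):

```python
# Sanity check on the quoted market figures: USD 585 million (2018) growing
# at a 16.2% CAGR for five years should land near the USD 1,236 million
# figure projected for 2023.
start_musd, cagr, years = 585.0, 0.162, 5
projected_musd = start_musd * (1 + cagr) ** years        # ≈ 1239

# And the CAGR implied by the two endpoint figures:
implied_cagr = (1236.0 / 585.0) ** (1.0 / years) - 1     # ≈ 0.161
```

The small gaps come from rounding in the published figures.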
13. USE OF BIG DATA IN THE AGRICULTURE SECTOR
Top four use cases for big data on the farm:
1. Feeding a growing population.
2. Using pesticides ethically.
3. Optimizing farm equipment.
4. Managing the supply chain.
14. 1. Feeding a growing population:
This is one of the key challenges that even governments are putting their heads together to solve. One way to achieve it is to increase the yield from existing farmlands.
Big data provides farmers with granular data on rainfall patterns, water cycles, fertilizer requirements, and more. This enables them to make smart decisions, such as what crops to plant for better profitability and when to harvest. The right decisions ultimately improve farm yields.
15. 2. Using pesticides ethically:
The administration of pesticides has been a contentious issue due to its side effects on the ecosystem. Big data allows farmers to manage this better by recommending which pesticides to apply, when, and in what quantity. By monitoring application closely, farmers can adhere to government regulations and avoid overuse of chemicals in food production. Moreover, this leads to increased profitability, because crops don't get destroyed by weeds and insects.
16. 3. Optimizing farm equipment:
Companies like John Deere have integrated sensors into their farming equipment and deployed big data applications that help better manage their fleets. For large farms, this level of monitoring can be a lifesaver, as it lets users know about tractor availability, service due dates, and fuel refill alerts. In essence, this optimizes usage and ensures the long-term health of farm equipment.
17. 4. Managing supply chain issues:
McKinsey reports that a third of food produced for human consumption is lost or wasted every year: a devastating fact, since the industry struggles to bridge the gap between supply and demand. To address this, the food delivery cycle from producer to market needs to be shortened. Big data can help achieve supply chain efficiencies by tracking and optimizing delivery truck routes.
18. Analysis of agriculture data using data mining techniques: an application of big data
In the agriculture sector, farmers and agribusinesses have to make innumerable decisions every day, with intricate complexities among the various factors influencing them. An essential issue for agricultural planning is accurate yield estimation for the numerous crops involved in the planning, and data mining techniques are a practical and effective approach to this problem. Agriculture has been an obvious target for big data: environmental conditions, variability in soil, input levels, combinations, and commodity prices have made it all the more relevant for farmers to use information to make critical farming decisions. This work focuses on the analysis of agriculture data and on finding optimal parameters to maximize crop production, using data mining techniques such as PAM, CLARA, DBSCAN, and multiple linear regression. Mining the large amount of existing crop, soil, and climatic data, and analyzing new, non-experimental data, optimizes production and makes agriculture more resilient to climate change.
Keywords: Big Data, PAM, CLARA
19. The input dataset consists of six years of data with the following parameters: year; state (Karnataka, 28 districts); district; crop (cotton, groundnut, jowar, rice, and wheat); season (kharif, rabi, summer); area (in hectares); production (in tonnes); average temperature (°C); average rainfall (mm); soil pH value; soil type; major fertilizers; nitrogen (kg/ha); phosphorus (kg/ha); potassium (kg/ha); minimum rainfall required; and minimum temperature required.
In the proposed work, a modified DBSCAN method is used to cluster the data by districts that have similar temperature, rainfall, and soil type. PAM and CLARA are used to cluster the data by districts producing the maximum crop yield (in the proposed work, wheat is taken as the example crop). Based on these analyses, the optimal parameters for maximum crop production are obtained. The multiple linear regression method is then used to forecast the annual crop yield.
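For illustration, a textbook DBSCAN pass over made-up district features might look like the sketch below (the proposed work uses a modified variant, which is not shown here; the data values are hypothetical):

```python
def dbscan(points, eps, min_pts):
    """Textbook DBSCAN sketch: group points whose eps-neighborhoods contain
    at least min_pts points; unreachable points become noise (label -1)."""
    def neighbors(i):
        # Indices of all points within distance eps of points[i] (incl. itself).
        return [j for j in range(len(points))
                if sum((a - b) ** 2 for a, b in zip(points[i], points[j])) <= eps ** 2]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1           # noise (may still be claimed as a border point)
            continue
        cluster += 1                 # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            jn = neighbors(j)
            if len(jn) >= min_pts:   # j is also core: keep expanding the cluster
                queue.extend(jn)
    return labels

# Hypothetical district features: (avg temperature °C, avg rainfall mm / 100).
districts = [(20, 6.0), (21, 6.1), (19, 5.9), (35, 1.0), (36, 1.1), (50, 9.0)]
labels = dbscan(districts, eps=2.0, min_pts=2)
```

Here the first three districts form one cluster, the next two another, and the last is flagged as noise.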
20. Partition around medoids (PAM)
PAM is a partitioning-based algorithm. It breaks the input data into a number of groups by finding a set of centrally located objects called medoids. Given the medoids, each data point is assigned to its nearest medoid to form the clusters. The algorithm has two phases:
22. 1. BUILD phase: a collection of k objects is selected as an initial set S.
• Arbitrarily choose k objects as the initial medoids.
• Repeat until no change:
– (Re)assign each object to the cluster with the nearest medoid.
– Improve the quality of the k medoids.
2. SWAP phase: try to improve the quality of the clustering by exchanging
selected objects with unselected objects, choosing the swap with the minimum cost.
Example: for each medoid m1 and each non-medoid data point d, swap m1 and d and
recompute the cost (the sum of distances of points to their medoid); if the total cost of
the configuration increased, undo the swap. Fig. 2 depicts the
steps involved in the PAM algorithm.
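The two phases above can be sketched in a few lines of Python. This is a minimal k-medoids, not the original implementation: the BUILD phase is simplified to the arbitrary initial choice described above, the SWAP phase keeps a swap only when the total cost drops, and the yield figures in the usage example are invented.

```python
import numpy as np

def pam(X, k, seed=0):
    """Minimal k-medoids: BUILD (arbitrary init) then SWAP until no improvement."""
    rng = np.random.default_rng(seed)
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    def cost(medoids):
        # total distance of every point to its nearest medoid
        return dist[:, medoids].min(axis=1).sum()

    # BUILD phase (simplified): arbitrarily choose k objects as initial medoids
    medoids = list(rng.choice(n, size=k, replace=False))
    best = cost(medoids)

    # SWAP phase: try exchanging each medoid with each non-medoid object
    improved = True
    while improved:
        improved = False
        for m in range(k):
            for d in range(n):
                if d in medoids:
                    continue
                trial = medoids.copy()
                trial[m] = d
                c = cost(trial)
                if c < best:              # keep the swap only if total cost drops
                    medoids, best = trial, c
                    improved = True
    labels = dist[:, medoids].argmin(axis=1)
    return medoids, labels, best

# Invented per-district wheat yields (tonnes/ha), clustered into k = 3 groups
yields = np.array([[2.0], [2.2], [5.0], [5.1], [9.0], [9.3]])
medoids, labels, total_cost = pam(yields, k=3)
print(medoids, labels, total_cost)
```

The three resulting clusters correspond to the LOW/MODERATE/HIGH production split used later in the slides.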
23. CLARA (Clustering Large Applications)
Designed by Kaufman and Rousseeuw to handle large datasets, CLARA relies on
sampling. Instead of finding representative objects for the entire dataset, CLARA draws a
sample of the dataset, applies PAM on the sample, and finds the medoids of the
sample. To arrive at better approximations, CLARA draws multiple samples
and returns the best clustering as the output. For accuracy, the quality of a
clustering is measured by the average dissimilarity of all objects in the
entire dataset.
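The sampling idea can be sketched as follows (a minimal illustration, not the original implementation): each sample gets its own small swap-based PAM, and the candidate medoid sets are compared by average dissimilarity over the full dataset. The sample size, sample count, and test data below are arbitrary choices.

```python
import numpy as np

def clara(X, k, n_samples=5, sample_size=20, seed=0):
    """CLARA sketch: run PAM on several random samples; keep the medoid set
    whose average dissimilarity over the ENTIRE dataset is lowest."""
    rng = np.random.default_rng(seed)

    def pam(S):
        # minimal swap-based k-medoids restricted to the sample S
        d = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=2)
        cost = lambda m: d[:, m].min(axis=1).sum()
        meds = list(rng.choice(len(S), size=k, replace=False))
        best = cost(meds)
        improved = True
        while improved:
            improved = False
            for i in range(k):
                for j in range(len(S)):
                    if j in meds:
                        continue
                    trial = meds.copy()
                    trial[i] = j
                    if cost(trial) < best:
                        meds, best, improved = trial, cost(trial), True
        return S[meds]                        # medoid coordinates

    best_meds, best_quality = None, np.inf
    for _ in range(n_samples):
        idx = rng.choice(len(X), size=min(sample_size, len(X)), replace=False)
        meds = pam(X[idx])
        # quality metric: average dissimilarity of ALL objects to their nearest medoid
        quality = np.linalg.norm(X[:, None, :] - meds[None, :, :],
                                 axis=2).min(axis=1).mean()
        if quality < best_quality:
            best_meds, best_quality = meds, quality
    return best_meds, best_quality

# Invented example: 60 points around three centers
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((20, 2)) for c in centers])
medoids, quality = clara(X, k=3)
print(np.round(medoids, 2), round(quality, 3))
```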
24. Multiple linear regression to forecast the crop yield
Multiple linear regression is a variant of linear regression analysis.
The model is built to establish the relationship between one
dependent variable and two or more independent variables.
For a given dataset where x1, …, xk are the independent
variables and Y is the dependent variable, multiple linear
regression fits the dataset to the model:
yi = β0 + β1x1i + β2x2i + ··· + βkxki + εi
25. β0 is the y-intercept, and the parameters β1, β2, …, βk are called the partial
coefficients. In matrix form, Y = XB + E.
Before applying multiple linear regression to forecast the crop yield,
it is necessary to identify the significant attributes in the database. Not all
attributes in the database are significant: changing the value of an
insignificant attribute does not affect the dependent variable, so such
attributes can be neglected. A p-value test is performed on
the database to find the significant attributes, and multiple linear
regression is applied only to the significant attributes to forecast the crop
yield.
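A small sketch of the matrix-form fit Y = XB + E with NumPy least squares. The yield model below (rainfall and nitrogen as predictors) and all of its coefficients are invented for illustration, and the p-value screening step described above is omitted here.

```python
import numpy as np

# Invented yield model: yield = b0 + b1*rainfall + b2*nitrogen + noise
rng = np.random.default_rng(0)
n = 200
rainfall = rng.uniform(400, 900, n)     # mm (made-up range)
nitrogen = rng.uniform(40, 120, n)      # kg/ha (made-up range)
noise = rng.normal(0, 0.1, n)
yield_t = 0.5 + 0.004 * rainfall + 0.01 * nitrogen + noise  # tonnes/ha

# Design matrix X with an intercept column; solve Y = XB + E by least squares
X = np.column_stack([np.ones(n), rainfall, nitrogen])
B, *_ = np.linalg.lstsq(X, yield_t, rcond=None)   # B = [b0, b1, b2]
print(np.round(B, 3))   # estimates should land near the true values [0.5, 0.004, 0.01]
```

Fitted coefficients in hand, a yield forecast for a new district is just X_new @ B.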
27. The figure depicts the different districts of Karnataka which have
similar temperature ranges, rainfall ranges and soil
types, respectively.
28. PAM
To apply the PAM algorithm on the dataset, the user initially needs to supply
k (the number of clusters); k is set to 3 in the current experiment. Crop
yield is categorised into LOW, MODERATE and HIGH production, and the
districts are clustered into 3 clusters using the PAM clustering method.
As a result of the analysis, North Karnataka districts such as Bijapur,
Dharwad, Bagalkot, Belgaum, Raichur, Bellary, Chitradurga and Davangere
are the districts with the maximum wheat crop production.
32. Study and analysis of temperature and wheat crop
production in different districts of Karnataka, as shown in
Fig. 12. From Fig. 12, we can analyze the optimal
temperature for the wheat crop.
36. Conclusion
Various data mining techniques are applied to the input data to
assess the best-performing method. The present work used the data
mining techniques PAM, CLARA and DBSCAN to obtain the optimal climate
requirements of wheat, such as the optimal ranges of best temperature, worst
temperature and rainfall, to achieve higher wheat crop production.
The clustering methods are compared using quality metrics. According to the
analysis of the clustering quality metrics, DBSCAN gives better clustering
quality than PAM and CLARA, and CLARA gives better clustering quality
than PAM. The proposed work can also be extended to analyze the soil
and other factors for the crop and to increase crop production under
different climatic conditions.
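The slides do not name the quality metrics used for this comparison; one common choice is the mean silhouette width, sketched here with NumPy as an illustration (not necessarily the metric used in the original work). The toy data and labelings are invented.

```python
import numpy as np

def silhouette(X, labels):
    """Mean silhouette width over all points; higher means better-separated clusters.
    Assumes at least two clusters."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        same[i] = False
        if not same.any():
            scores.append(0.0)           # convention for singleton clusters
            continue
        a = d[i, same].mean()            # mean intra-cluster distance
        # mean distance to the nearest OTHER cluster
        b = min(d[i, labels == c].mean() for c in set(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Two well-separated blobs: a clean labeling vs. a scrambled one
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
good = np.array([0, 0, 0, 1, 1, 1])
bad = np.array([0, 1, 0, 1, 0, 1])
print(silhouette(X, good), silhouette(X, bad))
```

Running each clustering method on the same data and comparing their mean silhouette widths gives one concrete version of the quality comparison described above.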
37. Image Processing: IM toolkit, VTK toolkit, OpenCV library
Machine Learning: R, Google TensorFlow, Apache Mahout, Weka, MLpack
Cloud-based platforms for large-scale information storing: EMC Corporation, MapR converged data platforms, Apache Pig
Big Databases: Hive, HadoopDB, MongoDB, Google Bigtable, Cassandra, PostGIS
Statistical tools: Norsys Netica, R, Weka
38. Image Processing Tools:
VTK toolkit:
The Visualization Toolkit (VTK) is open-source software for manipulating and
displaying scientific data. It comes with state-of-the-art tools for 3D
rendering, a suite of widgets for 3D interaction, and extensive 2D plotting
capability.
OpenCV:
OpenCV is a huge open-source library for computer vision, machine learning
and image processing; it now plays a major role in real-time operation,
which is very important in today's systems.
39. Machine Learning tools:
R:
R analytics is data analytics using R programming language, an open-source
language used for statistical computing or graphics. This programming
language is often used in statistical analysis and data mining. It can be used
for analytics to identify patterns and build practical models.
Google TensorFlow:
TensorFlow is a free and open-source software library for machine learning
and artificial intelligence. It can be used across a range of tasks but has a
particular focus on training and inference of deep neural networks.
40. Machine Learning tools (continued):
Apache Mahout:
Apache Mahout is an open-source project to create scalable machine-learning algorithms. Mahout operates on top of Hadoop,
which allows you to apply machine learning, via a selection of Mahout algorithms, to distributed computing with
Hadoop.
41. Cloud-based platforms for large-scale information storing:
EMC Corporation:
EMC is a multinational provider of products and services related to cloud computing,
storage, big data, data analytics, information security, content management, and converged
infrastructure. EMC was acquired by Dell in September 2016 and the company was
renamed to Dell EMC.
MapR converged data platforms:
A platform for all data and applications. With MapR,
users have a single platform (on a single codebase) that delivers data-wide convergence. It
is the only platform with a distributed file system that supports storage and analytics of
data streams, files, and NoSQL tables in the same converged platform.
Apache Pig:
Apache Pig is an abstraction over MapReduce. It is a tool/platform used to analyze
large sets of data by representing them as data flows. Pig is generally used with Hadoop; we
can perform all the data manipulation operations in Hadoop using Apache Pig.
42. Big Databases
Hive:
Hive allows users to read, write, and manage petabytes of data using SQL. Hive is built on top of Apache Hadoop, which
is an open-source framework used to efficiently store and process large datasets. As a result, Hive is closely integrated
with Hadoop, and is designed to work quickly on petabytes of data.
HadoopDB:
Hadoop handles larger datasets but writes data only once, while SQL is easier to use but more difficult to scale. Apache
Hadoop is an open-source framework used to efficiently store and process large datasets ranging in size from
gigabytes to petabytes. Instead of using one large computer to store and process the data, Hadoop allows
clustering multiple computers to analyze massive datasets in parallel more quickly.
MongoDB:
MongoDB is an open-source document-oriented database designed to store data at large scale and to allow
you to work with that data very efficiently. It is categorized as a NoSQL (Not only SQL) database because the storage
and retrieval of data in MongoDB are not in the form of tables.
Google Bigtable:
Bigtable is ideal for storing large amounts of single-keyed data with low latency. It supports high read and write
throughput at low latency, and it is an ideal data source for MapReduce operations.
43. Statistical tools:
Norsys Netica: Netica is a powerful, easy-to-use, complete program for working with belief networks
and influence diagrams. It has an intuitive and smooth user interface for drawing the networks, and
the relationships between variables may be entered as individual probabilities, in the form of
equations, or learned from data files (which may be in ordinary tab-delimited form and have "missing
data").
Weka:
Weka is a collection of machine-learning algorithms for data mining tasks. The algorithms can either
be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-
processing, classification, regression, clustering, association rules, and visualization.
44. References:
The objective of the proposed work is to analyse the agriculture data using
data mining. In the proposed work, agriculture data has been collected from the
following sources:
Dataset in the agricultural sector [https://data.gov.in/, http://raitamitra.kar.nic.in/statistics]
Crop-wise agriculture data [html://CROPWISE_NORMAL_AREA]
Agriculture data of different districts [http://14.139.94.101/fertimeter/Distkar.aspx, http://raitamitra.kar.nic.in/ENG/statistics.asp]
Agriculture data based on weather, temperature and relative humidity [http://dmc.kar.nic.in/trg.pdf]