Luiz Henrique Zambom Santana, D.Sc.
Perspectives on the use of data in
Agriculture
Agenda
● Motivation
● Challenges
● Opportunities
● Machine learning for agriculture
● Conclusions
Data management in Agriculture
Precision
agriculture is an
incredibly
complex data
environment
nowadays
Data management in Agriculture
Data, data and data…
The data integration objective is to "offer uniform
access to a set of autonomous and heterogeneous
data sources".
Doan, AnHai, Alon Halevy, and Zachary Ives. Principles of data integration. Elsevier, 2012.
Big Data
Latency and throughput
Examples: operations files
● Point data
● Very dense
● Planting, harvesting, and
applications
● Many different types of
files and formats
● Pieces of operations
delivered from different
sources
https://medium.com/leaf-agriculture/merge-of-files-into-operations-1e62726df64d
Examples: satellite imagery
● Raster data
● Very dense
● Different light spectrum
frequencies (bands)
● Many different types of files
and formats
● Different resolution
https://withleaf.io/en/blog/ndvi-vs-ndre/
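The bands listed above are usually combined into vegetation indices such as NDVI and NDRE, the two indices compared in the linked post. Below is a minimal sketch in plain Python, assuming each band is a flat list of reflectance values already scaled to 0..1. The function names and the sample pixel values are invented for illustration and are not part of Leaf's API.

```python
def normalized_difference(band_a, band_b):
    """Return (a - b) / (a + b) per pixel, or None where a + b is zero."""
    result = []
    for a, b in zip(band_a, band_b):
        total = a + b
        result.append((a - b) / total if total != 0 else None)
    return result


def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


def ndre(nir, red_edge):
    """NDRE = (NIR - RedEdge) / (NIR + RedEdge)."""
    return normalized_difference(nir, red_edge)


# Hypothetical reflectance values for three pixels (the last one has no data)
nir = [0.50, 0.40, 0.0]
red = [0.10, 0.20, 0.0]
red_edge = [0.30, 0.30, 0.0]

ndvi_values = ndvi(nir, red)
ndre_values = ndre(nir, red_edge)
```

In a real pipeline the bands would come from raster files at possibly different resolutions, so they would need to be resampled to a common grid before this per-pixel arithmetic applies.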
Examples: weather
https://withleaf.io/en/blog/launching-leaf-weather/
● Point data
● Sparse (Interpolated)
● Many different properties
● Many different sources
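Because weather observations are sparse, a value at a field location has to be interpolated from nearby stations. The slide does not say which method is used, so the sketch below assumes inverse distance weighting (IDW) purely as an illustration. The station coordinates and values are made up, and distances are computed in plain coordinate units rather than on the ellipsoid.

```python
import math


def idw(stations, lon, lat, power=2.0):
    """Estimate a value at (lon, lat) by inverse distance weighting.

    stations: list of (lon, lat, value) tuples.
    Distances are planar, which is only a rough approximation for
    geographic coordinates.
    """
    if not stations:
        raise ValueError("at least one station is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for s_lon, s_lat, value in stations:
        dist = math.hypot(lon - s_lon, lat - s_lat)
        if dist == 0:
            return value  # the query point coincides with a station
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical stations reporting rainfall in mm
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint = idw(stations, 1.0, 0.0)
at_station = idw(stations, 0.0, 0.0)
```

A higher `power` makes the nearest station dominate; a production system would also have to reconcile the different properties and sources mentioned above before interpolating.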
A taxonomy for Precision Agriculture
A taxonomy for Precision Agriculture
https://www.linkedin.com/pulse/towards-taxonomy-data-precision-agriculture-time-luiz-henrique
Challenges - why not just use the standards?
● Many formats, mostly
general-purpose
● GIS and Big Data
● Lack of trained
developers
● The technology will not
wait for our standards
Challenges - Many types of data sources
● Irrigation
● Soil
● Weather
● FMIS
● …
● And what is still being developed will also need data
integration (robots, Ag BioTech)
Challenges
● Big areas + Big time = Big data
● Storing, processing, securing, making sense
What are we doing at Leaf Agriculture?
Leaf's API
● We help AgTechs
to access data
from different
sources
● Almost everything
is GIS
Leaf's API
Data integration
Data integration
What are people doing with this data?
● Crop Insurance:
○ https://withleaf.io/en/blog/3-reasons-to-use-more-data-in-crop-insurance/
● Seed improvement
● Farm management
○ Machine operations
○ Field monitoring
● Farm financial organization
● Ecological assets management
● …
Opportunities
● Blockchain
○ Smart contracts
● Big data in general
○ Data lakes
■ Huge amount of data
■ Historical data is very important
○ NoSQL
■ Highly unstructured data
● GIS
○ Almost everything
Opportunities
● Data integration
○ Offer different versions of the same data:
■ Raw data
■ Standardized data
■ Cleaned
■ Operations
■ Image
○ Link the data:
■ Operations files relate to fields, which relate to machines,
which relate to the resources used
● Developer-centric approach
○ Better documentation
○ Demystify AgTech for developers (we are just starting)
How AI can Improve Agriculture
● AI will likely improve all the
traditional fields of agriculture,
but the following areas have huge
potential:
○ Robotics
○ ESG
○ Real Time insights
■ Water stress
■ Machine maintenance
■ Farm to fork
Robotics
https://www.youtube.com/watch?v=Ql8MbI2oXzM
● Involves many segments of ML/AI
including vision and sensors
● Benefits of the use of smart
machines in agriculture:
○ Reduce waste
○ Reduce resource
consumption, especially water
○ Reduce pollution
○ Improve safety and labor
quality for the workers
Real Time insights
● Providing short-term insights to
farmers will become crucial
● Examples: pests, water stress,
machinery breakdown
● This involves feeding a huge amount
of data into machine learning
pipelines and delivering insights
directly in the field
ESG
https://www.eea.europa.eu/data-and-maps/dashboards/emissions-trading-viewer-1
https://youtu.be/9746wy5BTtI
Somebody will code it :)
How to start with Leaf?
1. Register your account on:
https://withleaf.io/account/quickstart
2. We will provide sample data for satellite and
operations - we can help with more data upon
request
3. Start building:
○ https://learn.withleaf.io/
○ https://github.com/Leaf-Agriculture/Leaf-API-Postman-Collection
4. Register for free credits:
○ https://withleaf.io/account/startups
Ideas on how to organize a data pipeline for production
● Remember: GIS + BigData (time series) + ML
● GeoJSON as format
● GIS manipulation
○ Python is the best
○ GDAL + GeoPandas
● BigData
○ Spark and Sedona
● Database
○ MongoDB
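To make the "GeoJSON as format" point concrete, here is a small sketch that turns flat point records into a GeoJSON FeatureCollection using only the standard library. The record fields (`yield`, `moisture`) and the coordinates are hypothetical and do not reflect Leaf's actual schema; in practice GeoPandas or GDAL would usually handle this step.

```python
import json


def to_feature(lon, lat, properties):
    """Build a GeoJSON Point feature; GeoJSON orders coordinates as [lon, lat]."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


def to_feature_collection(records):
    """Convert dicts with 'lon' and 'lat' keys into a FeatureCollection."""
    features = []
    for record in records:
        properties = {k: v for k, v in record.items() if k not in ("lon", "lat")}
        features.append(to_feature(record["lon"], record["lat"], properties))
    return {"type": "FeatureCollection", "features": features}


# Hypothetical harvest points
records = [
    {"lon": -48.52, "lat": -27.60, "yield": 3.2, "moisture": 14.1},
    {"lon": -48.51, "lat": -27.61, "yield": 2.7, "moisture": 16.4},
]

collection = to_feature_collection(records)
serialized = json.dumps(collection)
round_trip = json.loads(serialized)
```

Keeping every stage of the pipeline on the same feature structure means the GIS, Big Data, and database layers can exchange data without custom converters.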
Infrastructure: Apache Sedona
Must watch: https://www.youtube.com/watch?v=YmYl4NGD2Ug&t=32s
Infrastructure: MongoDB
● Faster and more scalable than PostGIS
○ Especially true under a heavy write load
● GeoJSON :)
● But avoid storing point data as much as possible,
better keep it in files using GeoJSON or GeoParquet
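One way to read the advice above is to store only field-level geometry in MongoDB and keep a pointer to the file that holds the dense point data. The sketch below builds such a document and a geospatial filter as plain dictionaries, so it runs without a database. The field names (`boundary`, `points_uri`) and the bucket path are hypothetical; the pymongo calls in the trailing comments assume a running server and are not executed here.

```python
def field_document(field_id, boundary_ring, points_uri):
    """Describe a field by its boundary polygon plus a pointer to its point file.

    boundary_ring: list of [lon, lat] pairs whose first and last entries match.
    """
    if boundary_ring[0] != boundary_ring[-1]:
        raise ValueError("GeoJSON polygon rings must be closed")
    return {
        "_id": field_id,
        "boundary": {"type": "Polygon", "coordinates": [boundary_ring]},
        "points_uri": points_uri,  # hypothetical location of a GeoParquet file
    }


def containing_field_query(lon, lat):
    """Build a filter that matches fields whose boundary covers the given point."""
    return {
        "boundary": {
            "$geoIntersects": {
                "$geometry": {"type": "Point", "coordinates": [lon, lat]}
            }
        }
    }


ring = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 0.0]]
doc = field_document("field-1", ring, "s3://example-bucket/field-1/harvest.parquet")
query = containing_field_query(0.5, 0.5)

# With pymongo and a running server, usage would look roughly like:
#   collection.create_index([("boundary", "2dsphere")])
#   collection.insert_one(doc)
#   collection.find_one(query)
```

The database then answers "which field is this?" quickly, while the heavy point data stays in files that Spark or GeoPandas can scan in bulk.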
Infrastructure: emerging technologies
● GeoParquet
● SedonaDB = GIS as a service
● Kepler for visualization https://kepler.gl/
● …
● UFSC's Database group
Example: identify yield level in different
areas of a field
https://www.linkedin.com/pulse/creating-field-zones-apache-spark-using-leaf-data-luiz-henrique/
● Task:
○ Read a file with yield point data
(provided by Leaf as GeoJSON:
https://learn.withleaf.io/docs/operations_sample_output#field-operations-filtered-geojson)
○ Classify the points using k-means
○ Generate polygons using Apache
Sedona buffer function
○ …
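The classification step can be illustrated without a cluster. The sketch below is a plain-Python stand-in for the Spark ML k-means mentioned above, clustering points on a single yield value. The `yield` property name and the sample values are assumptions, and the deterministic initialization is chosen only to keep the example reproducible.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalar values into k groups; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted values
    step = max(k - 1, 1)
    centroids = [ordered[(len(ordered) - 1) * i // step] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Empty clusters keep their previous centroid
        updated = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    labels = [
        min(range(k), key=lambda i: abs(value - centroids[i])) for value in values
    ]
    return centroids, labels


# Hypothetical yield readings taken from GeoJSON point features
features = [
    {"geometry": {"type": "Point", "coordinates": [0.0, 0.0]}, "properties": {"yield": 1.0}},
    {"geometry": {"type": "Point", "coordinates": [0.1, 0.0]}, "properties": {"yield": 1.2}},
    {"geometry": {"type": "Point", "coordinates": [0.2, 0.0]}, "properties": {"yield": 0.9}},
    {"geometry": {"type": "Point", "coordinates": [0.0, 0.1]}, "properties": {"yield": 5.0}},
    {"geometry": {"type": "Point", "coordinates": [0.1, 0.1]}, "properties": {"yield": 5.2}},
    {"geometry": {"type": "Point", "coordinates": [0.2, 0.1]}, "properties": {"yield": 4.8}},
]

yields = [f["properties"]["yield"] for f in features]
centroids, labels = kmeans_1d(yields, k=2)
```

Each label can be written back into the feature's properties so that later steps can build one zone polygon per yield level.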
Example: identify yield level in different
areas of a field
https://www.linkedin.com/pulse/creating-field-zones-apache-spark-using-leaf-data-luiz-henrique/
● Task:
○ …
○ Discover which variable affected the
low yield areas using regression
○ Send an alert to the farm with the
property that is affecting the yield
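As a rough illustration of the regression step, the sketch below fits a separate least-squares line of yield against each candidate variable and reports the one with the highest R². This univariate screening is an assumption made for brevity, not necessarily the method used in the linked article, and a strong fit indicates association rather than cause. The variable names and values are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept; returns (slope, intercept, r2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0, mean_y, 0.0  # a constant series explains nothing
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


def strongest_driver(rows, target, candidates):
    """Return the candidate variable with the highest R² against the target."""
    ys = [row[target] for row in rows]
    scores = {
        name: fit_line([row[name] for row in rows], ys)[2] for name in candidates
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical points from a low-yield zone
rows = [
    {"moisture": 10.0, "speed": 5.0, "yield": 5.0},
    {"moisture": 12.0, "speed": 7.0, "yield": 4.0},
    {"moisture": 14.0, "speed": 5.0, "yield": 3.0},
    {"moisture": 16.0, "speed": 7.0, "yield": 2.0},
    {"moisture": 18.0, "speed": 5.0, "yield": 1.0},
]

best, scores = strongest_driver(rows, "yield", ["moisture", "speed"])
```

The name returned in `best` is the property that an alert to the farm would mention.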
Example: mixing Spark ML and Sedona
● Imagine a field where there is a harvest operation
● This field will have different levels of yield depending
on the point
● We can use Spark ML to group the points according
to moisture
Example: mixing Spark ML and Sedona
○ Transform the RDD into a SpatialRDD
○ Run a SQL transformation using the buffer
function
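Sedona exposes the buffer operation in SQL as `ST_Buffer`. To show what that step produces without a Spark session, the sketch below approximates a buffer around a single point as a closed GeoJSON polygon ring in plain Python. It is a stand-in for illustration only: it works in planar units, and the `segments` parameter is an invented knob rather than a Sedona argument.

```python
import math


def buffer_point(x, y, radius, segments=16):
    """Approximate a circular buffer around (x, y) as a GeoJSON Polygon."""
    if segments < 3:
        raise ValueError("a polygon needs at least three segments")
    ring = [
        [
            x + radius * math.cos(2 * math.pi * i / segments),
            y + radius * math.sin(2 * math.pi * i / segments),
        ]
        for i in range(segments)
    ]
    ring.append(ring[0])  # GeoJSON rings repeat the first vertex at the end
    return {"type": "Polygon", "coordinates": [ring]}


# Buffer one hypothetical harvest point by 1 unit using an octagon
polygon = buffer_point(0.0, 0.0, 1.0, segments=8)
ring = polygon["coordinates"][0]
```

Buffering every point in a cluster and merging the overlapping polygons is what turns the classified points into contiguous zones.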
Conclusions
● The data created in the farms is yet to be unlocked
● The standards are very important, but only part of
the solution
● Even with the best standards, there will always be a
gap for new data providers and use cases
https://withleaf.io/registration/
Where to learn more?
● Journals
○ AgFunder
○ Computers and Electronics in Agriculture
● Podcasts
○ AgTech… so what?
https://www.agtechsowhat.com/
○ AgTech Garage podcast
https://open.spotify.com/show/3MhDeWCL5ElGk228WcCs9w
● Leaf's blog and webinars:
https://withleaf.io/en/blog
We are hiring!
https://withleaf.io/company/careers
References
● https://www.sciencedirect.com/science/article/pii/S0308521X21002511
● https://intellias.com/how-to-encourage-farmers-to-use-big-data-analytics-in-agriculture/
● https://valor.globo.com/brasil/noticia/2023/09/29/haddad-agro-vai-perder-se-ficar-fora-do-mercado-de-crdito-de-carbono.ghtml
● https://www.gov.br/fazenda/pt-br/assuntos/noticias/2023/junho/grupo-de-trabalho-interministerial-conclui-proposta-para-o-sistema-brasileiro-de-comercio-de-emissoes
● https://www.databricks.com/resources/ebook/tap-full-potential-llm
● https://www.confluent.io/events/current-2022/real-time-processing-of-spatial-data-using-kafka-streams/
Luiz Henrique Zambom Santana, D.Sc.
Perspectives on the use of data in
Agriculture
luiz@withleaf.io
The best way to build with farm data_

Perspectives on the use of data in Agriculture - Luiz Santana - Leaf Agriculture .pptx.pdf

  • 1. Luiz Henrique Zambom Santana, D.Sc. Perspectives on the use of data in Agriculture
  • 2. Agenda ● Motivation ● Challenges ● Opportunities ● Machine learning for agriculture ● Conclusions
  • 3. Data management in Agriculture Precision agriculture is a incredibly complex data environment nowadays
  • 4. Data management in Agriculture
  • 5. Data, data and data… The data integration objective is to "offer uniform access to a set of autonomous and heterogeneous data sources". Doan, AnHai, Alon Halevy, and Zachary Ives. Principles of data integration. Elsevier, 2012.
  • 8. Examples: operations files ● Point data ● Very dense ● Planting, harvesting, and applications ● Many different types of files and formats ● Pieces of operations delivered from different sources https://medium.com/leaf-agriculture/merge-of-files-into-operations-1e62726df64d
  • 9. Examples: satellite imagery ● Raster data ● Very dense ● Different light spectrum frequencies (bands) ● Many different types of files and formats ● Different resolution https://withleaf.io/en/blog/ndvi-vs-ndre/
  • 10. Examples: weather https://withleaf.io/en/blog/launching-leaf-weather/ ● Point data ● Sparse (Interpolated) ● Many different properties ● Many different sources
  • 11. A taxonomy for Precision Agriculture
  • 12. A taxonomy for Precision Agriculture https://www.linkedin.com/pulse/towards-taxonomy-data-precision-agriculture-time-luiz-henrique
  • 13. Challenges - why not just use the standards? ● Many formats, mostly general-purpose ● GIS and Big Data ● Lack of trained developers ● The technology will not wait for our standards
  • 14. Challenges - Many types of data sources ● Irrigation ● Soil ● Weather ● FMIS ● … ● And what is still being developed will also need data integration (robots, Ag BioTech)
  • 15. Challenges ● Big areas + Big time = Big data ● Storing, processing, securing, making sense
  • 16. What are we doing at Leaf Agriculture?
  • 17. Leaf's API ● We help AgTechs to access data from different sources ● Almost everything is GIS
  • 21. What are people doing with this data? ● Crop Insurance: ○ https://withleaf.io/en/blog/3-reasons-to-use-more-data-in-crop-insurance/ ● Seed improvement ● Farm management ○ Machine operations ○ Field monitoring ● Farm financial organization ● Ecological assets management ● …
  • 22. Opportunities ● Blockchain ○ Smart contracts ● Big data in general ○ Data lakes ■ Huge amount of data ■ Historical data is very important ○ NoSQL ■ Highly unstructured data ● GIS ○ Almost everything
  • 23. Opportunities ● Data integration ○ Offer different versions of the same data: ■ Raw data ■ Standardized data ■ Cleaned ■ Operations ■ Image ○ Link the data: ■ Operations files relate to fields related to machines related to the resources used ● Developer-centric approach ○ Better documentation ○ Demystify AgTech for developers (we are just starting)
  • 24. How AI can Improve Agriculture ● AI will likely improve all the traditional fields of agriculture, but the following have huge potential: ○ Robotics ○ ESG ○ Real Time insights ■ Water stress ■ Machine maintenance ■ Farm to fork
  • 25. Robotics https://www.youtube.com/watch?v=Ql8MbI2oXzM ● Involves many segments of ML/AI including vision and sensors ● Benefits of the use of smart machines in agriculture: ○ Reduce waste ○ Reduce resource consumption, especially water ○ Reduce pollution ○ Improve safety and labor quality for the workers
  • 26. Real Time insights ● Providing short-term insights to farmers will become crucial ● Examples: pests, water stress, machinery breakdown ● This involves feeding a huge amount of data into machine learning pipelines and delivering insights directly in the field
  • 28. How to start with Leaf? 1. Register your account on: https://withleaf.io/account/quickstart 2. We will provide sample data for satellite and operations - we can help with more data upon request 3. Start building: ○ https://learn.withleaf.io/ ○ https://github.com/Leaf-Agriculture/Leaf-API-Postman-Collection 4. Register for free credits: ○ https://withleaf.io/account/startups
  • 29. Ideas on how to organize a data pipeline for production ● Remember: GIS + BigData (time series) + ML ● GeoJSON as format ● GIS manipulation ○ Python is the best ○ GDAL + GeoPandas ● BigData ○ Spark and Sedona ● Database ○ MongoDB
  • 30. Infrastructure: Apache Sedona Must watch: https://www.youtube.com/watch?v=YmYl4NGD2Ug&t=32s
  • 31. Infrastructure: MongoDB ● Faster and more scalable than PostGIS ○ Especially true under write-heavy loads ● GeoJSON :) ● But avoid storing point data as much as possible; it is better to keep it in files using GeoJSON or GeoParquet
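A rough sketch of what "GeoJSON in MongoDB" can look like, following the advice above to keep dense point data in files. The collection layout, field names, coordinates, and file path are all hypothetical. The documents are plain Python dicts, and the pymongo calls appear only as comments because they need a running server.

```python
import json

# Hypothetical document: one field boundary stored as GeoJSON.
# Dense point data stays in a file; the document only references it.
field_doc = {
    "name": "example-field",
    "boundary": {
        "type": "Polygon",
        # GeoJSON order is [longitude, latitude]; the ring must be closed.
        "coordinates": [[
            [-48.52, -27.60], [-48.51, -27.60],
            [-48.51, -27.59], [-48.52, -27.59],
            [-48.52, -27.60],
        ]],
    },
    "points_file": "operations/example-field.parquet",  # hypothetical path
}

# Query filter: fields whose boundary intersects a given point.
query = {
    "boundary": {
        "$geoIntersects": {
            "$geometry": {"type": "Point", "coordinates": [-48.515, -27.595]}
        }
    }
}

# With pymongo and a running server, these would be used roughly as:
#   collection.create_index([("boundary", "2dsphere")])
#   collection.insert_one(field_doc)
#   collection.find(query)

print(json.dumps(query))
```

Because the stored geometry is already GeoJSON, the same `boundary` object can be handed to other GeoJSON-aware tools without conversion.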
  • 32. Infrastructure: emerging technologies ● GeoParquet ● SedonaDB = GIS as a service ● Kepler for visualization https://kepler.gl/ ● … ● UFSC's Database group
  • 33. Example: identify yield level in different areas of a field https://www.linkedin.com/pulse/creating-field-zones-apache-spark-using-leaf-data-luiz-henrique/ ● Task: ○ Read a file with yield point data (provided by Leaf in a GeoJSON https://learn.withleaf.io/docs/operations_sample_output#field-operations-filtered-geojson) ○ Classify the points using k-means ○ Generate polygons using Apache Sedona buffer function ○ …
  • 34. Example: identify yield level in different areas of a field https://www.linkedin.com/pulse/creating-field-zones-apache-spark-using-leaf-data-luiz-henrique/ ● Task: ○ … ○ Discover which variable affected the low yield areas using regression ○ Send an alert to the farm with the property that is affecting the yield
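The regression step of this task can be sketched in plain Python. This is a simplification, not the method from the linked article: it fits one single-variable least-squares line per candidate property and reports the property with the highest R². The property names (`moisture`, `elevation`) and the sample values are made up for illustration.

```python
def r_squared(xs, ys):
    """R^2 of a one-variable least-squares fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column explains nothing
    return (sxy * sxy) / (sxx * syy)


# Made-up samples; property names are hypothetical.
points = [
    {"yield": 2.0, "moisture": 10.0, "elevation": 5.0},
    {"yield": 4.0, "moisture": 20.0, "elevation": 3.0},
    {"yield": 6.0, "moisture": 30.0, "elevation": 6.0},
    {"yield": 8.0, "moisture": 40.0, "elevation": 4.0},
]

yields = [p["yield"] for p in points]
scores = {
    name: r_squared([p[name] for p in points], yields)
    for name in ("moisture", "elevation")
}
best = max(scores, key=scores.get)
print(f"Alert: yield varies most with '{best}' (R^2 = {scores[best]:.2f})")
```

A production version would more likely fit a multivariate model on the low-yield zones only, but the ranking idea is the same.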
  • 35. Example: mixing Spark ML and Sedona ● Imagine a field where there is a harvest operation ● This field will have different levels of yield depending on the point ● We can use Spark ML to group the points according to moisture
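To make the grouping step concrete, here is a small stand-in for k-means on a single property. Spark ML ships its own `KMeans` in `pyspark.ml.clustering`; the version below is plain Python (Lloyd's algorithm on scalars) so it runs without a cluster. The function name `kmeans_1d` and the moisture values are assumptions for illustration.

```python
def kmeans_1d(values, k, iterations=20):
    """Lloyd's algorithm on scalars; returns (centers, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread initial centers over the sorted values.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = sum(members) / len(members)
    return centers, labels


# Made-up moisture readings for six harvest points.
moisture = [11.0, 12.0, 13.0, 21.0, 22.0, 23.0]
centers, labels = kmeans_1d(moisture, k=2)
```

Each label can then be attached to its point so that later spatial steps work per group.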
  • 36. Example: mixing Spark ML and Sedona ○ Transform the RDD into a SpatialRDD ○ Run a SQL transformation using the buffer function
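In Sedona the buffer is applied through the SQL function `ST_Buffer`. What a buffer produces can be illustrated without Spark: the sketch below approximates a circular buffer around one point as a closed polygon ring. It is a plain-Python stand-in rather than Sedona code, and it assumes planar coordinates (for example meters in a projected CRS). The name `buffer_point` and the sample values are hypothetical.

```python
import math


def buffer_point(x, y, radius, segments=16):
    """Approximate a circular buffer around (x, y) as a closed polygon ring."""
    ring = [
        (x + radius * math.cos(2 * math.pi * i / segments),
         y + radius * math.sin(2 * math.pi * i / segments))
        for i in range(segments)
    ]
    ring.append(ring[0])  # close the ring, as GeoJSON polygons require
    return ring


# Buffer of 5 units around a made-up point.
ring = buffer_point(100.0, 200.0, radius=5.0)
```

Buffering every point in a group and merging the overlapping polygons is one way to turn classified points into zone polygons.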
  • 37. Conclusions ● The data created on farms is yet to be unlocked ● Standards are very important, but only part of the solution ● Even with the best standards, there will always be a gap for new data providers and use cases https://withleaf.io/registration/
  • 38. Where to learn more? ● Journals ○ AgFunder ○ Computers and Electronics in Agriculture ● Podcasts ○ AgTech… so what? https://www.agtechsowhat.com/ ○ AgTech Garage podcast https://open.spotify.com/show/3MhDeWCL5ElGk228WcCs9w ● Leaf's blog and webinars: https://withleaf.io/en/blog
  • 40. References ● https://www.sciencedirect.com/science/article/pii/S0308521X21002511 ● https://intellias.com/how-to-encourage-farmers-to-use-big-data-analytics-in-agriculture/ ● https://valor.globo.com/brasil/noticia/2023/09/29/haddad-agro-vai-perder-se-ficar-fora-do-mercado-de-crdito-de-carbono.ghtml ● https://www.gov.br/fazenda/pt-br/assuntos/noticias/2023/junho/grupo-de-trabalho-interministerial-conclui-proposta-para-o-sistema-brasileiro-de-comercio-de-emissoes ● https://www.databricks.com/resources/ebook/tap-full-potential-llm ● https://www.confluent.io/events/current-2022/real-time-processing-of-spatial-data-using-kafka-streams/
  • 41. Luiz Henrique Zambom Santana, D.Sc. Perspectives on the use of data in Agriculture luiz@withleaf.io
  • 42. The best way to build with farm data_