Big Data and geopositioning
or
it’s bigger on the inside
Jorge López-Malla Matute
Senior Data Engineer
Index
1. Presentation
2. What does “Key Value” mean and why does it matter so much?
3. Why do we need geopositioning analytics?
4. How can we merge these two worlds?
5. Q&A
Presentation
SKILLS
JORGE LÓPEZ-MALLA
@jorgelopezmalla
Big Data architect, Spark certification number 13,
from La Rioja and short-sighted.
After years of trying to solve modern problems
with traditional technologies, I tried Big Data
and saw that it solved them!
What we do
Geoblink is the ultimate location intelligence
solution that helps companies of any size
make strategic, location-related decisions
on an easy-to-use platform
How we do it
COLLECTING DATA
We combine our client’s internal data with external data and Geoblink’s proprietary location data
TRANSFORMING DATA
We process and analyze data using advanced analytics (big data) and artificial intelligence techniques
PROVIDING INSIGHTS
We present insights on a user-friendly platform to help companies make powerful, data-driven decisions
What does “Key Value” mean
and why does it matter so
much?
A little bit of history
● Big Data was born in the early 2000s
● Data is no longer small enough to fit on a single commodity
machine
● Data grows exponentially
● Vertical scaling is both dangerous and expensive
● Solutions?
[Figure: diagram labeled with the data sizes 3G, 1G, 15G, 15G, 6G, 12G, 12G and 12G]
Processing & Storing
● Choosing a proper key is critical not only in storage systems but
also in distributed processing frameworks
● Spark, probably the most important distributed processing
framework right now, is no exception
● The key matters in both streaming and batch processing
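The effect of the key on how work is spread can be sketched in plain Python, without Spark. The `partition_for` helper and the sample records are hypothetical, and the CRC32-modulo hashing only approximates what a hash partitioner does in a real engine.

```python
import zlib
from collections import Counter

def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (illustrative, not Spark's own)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def partition_sizes(records, key_fn, num_partitions):
    """Count how many records land in each partition for a given key function."""
    counts = Counter(partition_for(key_fn(r), num_partitions) for r in records)
    return [counts.get(p, 0) for p in range(num_partitions)]

# Hypothetical records (user_id, city): nine out of ten come from one city.
records = [(f"U{i}", "Madrid" if i % 10 else "Logroño") for i in range(1000)]

by_city = partition_sizes(records, lambda r: r[1], 4)  # low-cardinality key
by_user = partition_sizes(records, lambda r: r[0], 4)  # high-cardinality key

print("keyed by city:", by_city)
print("keyed by user:", by_user)
```

Keyed by city, at least 900 of the 1000 records end up in a single partition; keyed by user, the same records spread over all partitions.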
Why do we need Geopositioning
analytics?
The Five Ws
The Five Ws are questions whose answers are considered basic in
information gathering or problem solving
● Who was involved?
● What happened?
● Why did that happen?
● When did it take place?
● Where did it take place?
The where matters
● A digital society needs immediate reactions
● “Slow” responses are no longer useful
● Big Data allows us to answer 4 of the 5 W questions
● The geospatial problem is not just an enterprise problem
Real world
Business world
How can we merge these two worlds?
Merging worlds
● Knowing both the problem to solve and the technology should be
enough
● Obtaining the proper key is the “key” in every Big Data project
● In geospatial projects it is fundamental to obtain the results
exactly where we want them
● With this in mind, we should find the key for each record of our
dataset. Easy … or not?
It’s bigger on the inside
The real problem
● Remember: we should assign a key to a value using as little
logic as possible
● All geospatial logic must be understandable by humans
● The intuitive approach is to assign each point to a known
geospatial area
Intersection
● Each coordinate is not relevant by itself
● To assign each coordinate to a recognizable area we need both
geometries
● So we need to intersect the coordinates with the areas
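A minimal sketch of that intersection, assuming planar coordinates and simple polygons. It uses the ray-casting test, which is one common way to check whether a point falls inside an area, not necessarily the one used in the talk; the area names and coordinates are made up.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]

def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray casting: count the polygon edges crossed by a horizontal ray from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def assign_area(point: Point, areas: Dict[str, Polygon]) -> Optional[str]:
    """Return the name of the first area containing the point, or None."""
    for name, polygon in areas.items():
        if point_in_polygon(point, polygon):
            return name
    return None

# Hypothetical areas: two adjacent squares.
areas = {
    "area_a": [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)],
    "area_b": [(4.0, 0.0), (8.0, 0.0), (8.0, 4.0), (4.0, 4.0)],
}

print(assign_area((1.0, 1.0), areas))
print(assign_area((5.0, 2.0), areas))
print(assign_area((9.0, 9.0), areas))
```

Every edge of every polygon is visited for each point, which is why the cost grows quickly with the number and complexity of the areas.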
Intersection
● The intersect operation has a high computational cost
● We should run it only where an intersection is likely
● We need to find a key that reduces the cost of the operation
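One common way to run the expensive test only where an intersection is likely is a bounding-box prefilter. The sketch below is an assumption about how that could look, not the speaker's implementation; the area names, coordinates and the `exact_tests` counter are illustrative.

```python
def bounding_box(polygon):
    """Smallest axis-aligned rectangle (min_x, min_y, max_x, max_y) around a polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)

def in_box(point, box):
    """Cheap test: four comparisons."""
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y

def exact_contains(point, polygon):
    """Expensive test: ray casting over every edge of the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside

def assign_with_prefilter(points, areas):
    """Assign each point to an area, counting how many exact tests were needed."""
    boxes = {name: bounding_box(poly) for name, poly in areas.items()}
    exact_tests = 0
    result = {}
    for p in points:
        result[p] = None
        for name, poly in areas.items():
            if not in_box(p, boxes[name]):
                continue  # rejected cheaply, no exact test
            exact_tests += 1
            if exact_contains(p, poly):
                result[p] = name
                break
    return result, exact_tests

# Hypothetical areas and points.
areas = {
    "triangle": [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)],
    "square": [(10.0, 10.0), (12.0, 10.0), (12.0, 12.0), (10.0, 12.0)],
}
points = [(1.0, 1.0), (3.0, 3.0), (11.0, 11.0), (50.0, 50.0)]

result, exact_tests = assign_with_prefilter(points, areas)
print(result)
print("exact tests:", exact_tests, "of", len(points) * len(areas), "pairs")
```

The point (3.0, 3.0) shows why the exact test is still needed: it lies inside the triangle's bounding box but outside the triangle itself.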
Finding a proper Key
● First of all, there is no silver bullet
● The “key” problem is worse in the Geospatial world
● Both storing and processing technologies have similar problems
● Geospatial indexes help a lot
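As an illustration of a geospatial index used as a key, the sketch below quantizes a longitude/latitude pair into a grid cell and interleaves the cell's column and row bits into a Z-order (Morton) value. The `cell_key` function and the 8-bit resolution are assumptions made for the example, and the city coordinates are approximate.

```python
from collections import defaultdict

def cell_key(lon: float, lat: float, bits: int = 8) -> int:
    """Quantize a coordinate into a 2**bits x 2**bits grid and interleave the
    column and row bits into a single Z-order (Morton) key."""
    cells = 1 << bits
    col = min(int((lon + 180.0) / 360.0 * cells), cells - 1)
    row = min(int((lat + 90.0) / 180.0 * cells), cells - 1)
    key = 0
    for i in range(bits):
        key |= ((col >> i) & 1) << (2 * i)      # even bits come from the column
        key |= ((row >> i) & 1) << (2 * i + 1)  # odd bits come from the row
    return key

# Approximate (lon, lat) coordinates, for illustration only.
madrid = (-3.7038, 40.4168)
madrid_nearby = (-3.70, 40.42)
logrono = (-2.4450, 42.4650)

# Group records by key: points in the same cell share a key.
groups = defaultdict(list)
for name, (lon, lat) in [("madrid", madrid), ("nearby", madrid_nearby), ("logrono", logrono)]:
    groups[cell_key(lon, lat)].append(name)

print(dict(groups))
```

The key is cheap to compute and groups nearby points together, but two points on either side of a cell border still get different keys, which is one reason such indexes help a lot without being a silver bullet.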
Spatial partitioner
Big Data initiatives
● Some geospatial technologies have been grouped by Eclipse under
LocationTech
● GeoSpark and Magellan are spatial modules for Spark
● Although we only talk about Spark, other processing engines
offer this functionality
● We have only tested processing engines, but we have researched
storage technologies
Processing engines
● Both Magellan and GeoSpark offer geospatial functionality
powered by Apache Spark
● Both allow us to use SparkSQL for geospatial queries
● Both optimize the queries in Spark
● GeoSpark’s documentation is better than Magellan’s
GeoSpark optimization
Do you really need a join?
● Spatial joins allow us to assign several geometries to a
geometry
● Remember: intersect operations come at a high cost
● In most use cases you only want a 1:1 mapping
● You can use broadcast variables!
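The broadcast idea can be sketched without Spark: a small, read-only table of areas is available to every partition, and each record is assigned locally to the first area that contains it, with no shuffle. In this sketch the areas are simplified to axis-aligned rectangles and the data is made up. In Spark, `lookup` would be wrapped with `SparkContext.broadcast` and read through `.value` inside the map function.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)

def first_area(point: Point, lookup: Dict[str, Box]) -> Optional[str]:
    """Return the first area containing the point: a 1:1 mapping, not a join."""
    x, y = point
    for name, (min_x, min_y, max_x, max_y) in lookup.items():
        if min_x <= x <= max_x and min_y <= y <= max_y:
            return name  # stop at the first hit
    return None

def assign_partition(records: List[Tuple[str, Point]], lookup: Dict[str, Box]):
    """Map-side work done independently on each partition."""
    return [(user, first_area(point, lookup)) for user, point in records]

# Small lookup table that every worker would receive once (hypothetical areas).
lookup = {
    "north": (0.0, 5.0, 10.0, 10.0),
    "south": (0.0, 0.0, 10.0, 5.0),
}

# Hypothetical partitions of (user, point) records.
partitions = [
    [("U1", (1.0, 8.0)), ("U2", (2.0, 1.0))],
    [("U3", (9.0, 5.0)), ("U1", (20.0, 20.0))],
]

assigned = [assign_partition(part, lookup) for part in partitions]
print(assigned)
```

This only works while the area table is small enough to be copied to every worker; a point on a shared border, such as (9.0, 5.0), simply goes to the first matching area.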
GeoMesa: Big Data storing
● GeoMesa is an open-source project for performing
geospatial operations against several data sources and
processing engines
● It has connectors to visual tools (like GeoServer)
● So far we have only tested GeoMesa with HBase, and only as a POC
[Diagram: input records split across Spark Executor-1 and Spark Executor-2, GeoMesa, and an HBase cluster (Master, Region Server-1, Region Server-2, Region Server-3) holding the same records keyed by point]
Input records:
(U1, Madrid, Point(x1, y1))
(U2, Logroño, Point(x2, y2))
(U1, Cadiz, Point(x3, y3))
(U3, Logroño, Point(x4, y4))
(U1, Ávila, Point(x5, y5))
(U2, Huelva, Point(x6, y6))
(U3, Huelva, Point(x7, y7))
(U2, Logroño, Point(x8, y8))
Rows on the region servers:
Point(x1, y1), [U1, Madrid]
Point(x5, y5), [U1, Ávila]
Point(x2, y2), [U2, Logroño]
Point(x4, y4), [U3, Logroño]
Point(x6, y6), [U2, Huelva]
Point(x7, y7), [U3, Huelva]
Point(x3, y3), [U1, Cadiz]
Point(x8, y8), [U2, Logroño]
[Diagram: Client.java sends a query through GeoMesa to the HBase cluster (Master, Region Server-1, Region Server-2, Region Server-3), which holds the same rows as in the previous diagram]
ECLQuery.toCQL("people between 1000, 200")
[Diagram: the Spark Driver reads through GeoMesa from the HBase cluster (Master, Region Server-1, Region Server-2, Region Server-3), which holds the same rows as in the previous diagrams]
// dsParams: connection parameters of the GeoMesa data store
val dataFrame = sparkSession.read
  .format("geomesa")
  .options(dsParams)
  .option("geomesa.feature", "spain")
  .load()
Takeaways
● We really need to deliver insights at the proper location
● Big Data requires finding a suitable key for our problem
● When dealing with large amounts of data we have to aggregate it
● Spatial indexes are adequate keys, but they are not perfect
● If you only need to assign one geometry to another, a spatial
join is not a good idea
Q&A
★ Job offers:
○ https://www.geoblink.com/work-with-us/
★ Contact:
○ jobs@geoblink.com
○ jlmalla@geoblink.com
Big Data geopositioning, or It's bigger on the inside. Commit Conf 2018
