Companies are expected to respond quickly to market changes driven by the emergence of new scenarios (Internet of Things, social analysis, Industry 4.0, etc.) that increasingly require integration with new technologies. Since these are innovative projects that impact business processes and contexts, it is essential to have flexible solutions for collecting and analyzing data.
MongoDB is a NoSQL database that offers flexibility, scalability, and simplified development. This lab illustrates how to build MongoDB architectures and carry out schema design for data management in IoT and Big Data contexts, with reference to real-world cases based on Cloud technologies, which are needed to cope with an increasingly global market.
3. “We are an Innovation Company. We design and develop cutting-edge software to drive our customers’ digital transformation, through Agile methodologies and continuous delivery.”
4. WE HELP OUR CUSTOMERS TO
- DESIGN IDEAS: We get powerful ideas to market fast
- CREATE PRODUCTS: We design and develop innovative and better software solutions
- EXTRACT VALUE FROM DATA: We collect and analyze data to help your decisions
7. “Internet of Things is a neologism referring to the extension of the Internet to the world of objects and concrete places.”
8. 2020 IoT Market Share (Source: IDC)
- 4 billion connected people
- $4 trillion business opportunity
- 25+ billion integrated systems connected to the Web
- 50 trillion GBs of data
29. Definition
Set of values of a variable detected at different timestamps.
[Figure: a time axis with samples f(t0), f(t1), f(t2), f(t3) taken at timestamps t0, t1, t2, t3]
30. Time Series Data is Everywhere
1. Financial markets pricing
2. Sensors (temperature, pressure, proximity)
3. Industrial Fleets (Location, velocity, operational)
5. Social networks (status updates)
6. Systems (server logs, application logs)
6. Mobile devices (calls, texts)
31. Time Series Data at a Higher Level
1. Widely applicable data model
2. Various schema and modeling options
3. Application requirements drive schema design
32. Time Series - Schema Design
How to Use MongoDB in IoT Area
33. Designing for writing and reading
1. One document per event
2. One document per minute (average)
3. One document per minute (by second)
4. One document per hour
34. One document per event
{
  server: "server1",
  load: 92,
  ts: ISODate("2014-10-16T22:07:38.000-0500")
}
1. Relational-centric approach
2. Insert-driven workload
3. Aggregations computed at application-level
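A minimal Python sketch of this insert-driven approach (plain dicts standing in for BSON documents, no live MongoDB; field values other than the slide's example are illustrative): each reading becomes its own document, and any per-minute average must be computed in the application.

```python
from datetime import datetime, timezone

# One document per event: every reading is its own insert.
events = [
    {"server": "server1", "load": load,
     "ts": datetime(2014, 10, 16, 22, 7, sec, tzinfo=timezone.utc)}
    for sec, load in enumerate([92, 88, 90, 94])
]

# Aggregation happens at application level, not in the database:
avg_load = sum(e["load"] for e in events) / len(events)
print(avg_load)  # 91.0
```

This mirrors a relational-centric design: simple writes, but every analytical question requires reading many documents back out.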
35. One document per minute (average)
{
  server: "server1",
  load_num: 92,
  load_sum: 4500,
  ts: ISODate("2014-10-16T22:07:00.000-0500")
}
1. Pre-aggregation to compute average per minute more easily
2. Update-driven workload
3. Minute-level resolution
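A Python sketch of the pre-aggregation pattern (no live MongoDB; it assumes `load_num` is the event count and `load_sum` the running total, which is how the average would be derived): each incoming reading becomes an in-place update rather than a new document.

```python
# Pre-aggregated per-minute document; load_num counts events,
# load_sum accumulates readings (field semantics assumed here).
doc = {"server": "server1", "load_num": 0, "load_sum": 0,
       "ts": "2014-10-16T22:07:00.000-0500"}

def record(doc, load):
    # Equivalent to a MongoDB update with
    # {"$inc": {"load_num": 1, "load_sum": load}}
    doc["load_num"] += 1
    doc["load_sum"] += load

for load in [40, 50, 45, 55]:
    record(doc, load)

avg = doc["load_sum"] / doc["load_num"]
print(avg)  # 47.5
```

The average now falls out of a single document read, at the cost of losing per-event resolution.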
36. One document per minute (by second)
{
  server: "server1",
  load: { 0: 15, 1: 20, ..., 58: 45, 59: 40 },
  ts: ISODate("2014-10-16T22:07:00.000-0500")
}
1. Store per second data at minute level
2. Update-driven workload
3. Pre-allocate structure to avoid document moves
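A Python sketch of the pre-allocation idea (dicts standing in for the BSON document; the second value and reading are illustrative): all 60 per-second slots are created up front so later updates never grow the document.

```python
# Pre-allocate all 60 per-second slots so in-place updates
# never enlarge the document (avoiding document moves on disk).
minute_doc = {
    "server": "server1",
    "load": {str(s): 0 for s in range(60)},
    "ts": "2014-10-16T22:07:00.000-0500",
}

def record_second(doc, second, load):
    # Equivalent to a MongoDB update on a dotted path,
    # e.g. {"$set": {"load.37": load}}
    doc["load"][str(second)] = load

record_second(minute_doc, 37, 42)
print(minute_doc["load"]["37"])  # 42
```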
37. One document per hour (by second)
{
  server: "server1",
  load: { 0: 15, 1: 20, ..., 3598: 45, 3599: 40 },
  ts: ISODate("2014-10-16T22:00:00.000-0500")
}
1. Store per second data at hourly level
2. Update-driven workload
3. Pre-allocate structure to avoid document moves
4. Updating the last second requires 3599 steps
38. One document per hour (by second)
{
  server: "server1",
  load: {
    0: { 0: 15, ..., 59: 45 },
    ...
    59: { 0: 25, ..., 59: 75 }
  },
  ts: ISODate("2014-10-16T22:00:00.000-0500")
}
1. Store per second data at hourly level with nesting
2. Update-driven workload
3. Pre-allocate structure to avoid document moves
4. Updating the last second requires 59+59 steps
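The step counts above can be checked with a small Python sketch (ordinary dicts standing in for BSON field order; the "step" model assumes reaching a field means scanning the keys that precede it, as in sequential BSON traversal): the flat hourly layout scans up to 3599 keys, while the nested layout scans at most 59 minute keys plus 59 second keys.

```python
# Flat hourly layout: 3600 per-second keys in one object.
hour_flat = {str(s): 0 for s in range(3600)}
# Nested hourly layout: 60 minute keys, each holding 60 second keys.
hour_nested = {str(m): {str(s): 0 for s in range(60)} for m in range(60)}

def steps_flat(second):
    # Keys scanned before reaching the target second.
    return list(hour_flat).index(str(second))

def steps_nested(minute, second):
    # Keys scanned at the minute level plus at the second level.
    return (list(hour_nested).index(str(minute))
            + list(hour_nested[str(minute)]).index(str(second)))

print(steps_flat(3599))      # 3599
print(steps_nested(59, 59))  # 118  (59 + 59)
```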
39. Writing operation analysis
1. Example: data generated every second
2. Capturing data per minute requires:
- One document per event: 60 writes
- One document per minute: 1 write, 59 updates
3. Transition from “insert-driven” to “update-driven”
- Individual writes are smaller
- Performance and concurrency benefits
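The write counts on this slide can be restated as a trivial Python calculation (the constants are the slide's own example of one reading per second):

```python
EVENTS_PER_MINUTE = 60  # data generated every second

# One document per event: every reading is an insert.
per_event_writes = EVENTS_PER_MINUTE           # 60 inserts

# One document per minute: the first reading inserts the
# document, the remaining readings update it in place.
per_minute_inserts = 1
per_minute_updates = EVENTS_PER_MINUTE - 1     # 59 updates

print(per_event_writes, per_minute_inserts, per_minute_updates)  # 60 1 59
```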
40. Read operation analysis
1. Example: data generated every second
2. Reading data for a single hour requires:
- One document per event: 3600 reads
- One document per minute: 60 reads
3. Read performance is greatly improved:
- Fewer disk seeks
- Optimization with tuned block sizes and read-ahead