INFLUXDB AND GRAFANA
fighting together with IoT data attack
Ivan Vaskevych @
https://www.easyitblog.info/
Path
What
Why
How
Plot some graphs
InfluxDB
MMXIII
Top ranking
source: https://db-engines.com/en/ranking/time+series+dbms
Ranking Driven (“R&D”)
source: https://db-engines.com/en/ranking/time+series+dbms
Why we use it
Coming soon…
Just how fast?
• Single-node c4.8xlarge AWS server (36 vCPUs,
60GB of RAM)
• Pushing data from another instance in AWS
• Result:
loaded 3888000 items in 9.311872sec with 32
workers (mean rate 417531.503762/sec,
180.18MB/sec from stdin)
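As a quick sanity check on the figures above, the mean rate is simply the item count divided by the elapsed time. The sketch below recomputes it from the numbers on the slide. The bytes-per-item estimate is an assumption: it treats "MB" as 10^6 bytes, which the benchmark output does not state.

```python
# Figures copied from the benchmark output on the slide.
ITEMS = 3_888_000
SECONDS = 9.311872
THROUGHPUT_MB_PER_SEC = 180.18


def mean_rate(items: int, seconds: float) -> float:
    """Items written per second."""
    return items / seconds


def bytes_per_item(mb_per_sec: float, items_per_sec: float) -> float:
    """Average payload size per item, ASSUMING 1 MB = 1_000_000 bytes."""
    return mb_per_sec * 1_000_000 / items_per_sec


if __name__ == "__main__":
    rate = mean_rate(ITEMS, SECONDS)
    size = bytes_per_item(THROUGHPUT_MB_PER_SEC, rate)
    print(f"mean rate: {rate:,.0f} items/sec")
    print(f"approx. payload: {size:.0f} bytes/item")
```

The recomputed rate agrees with the reported mean rate to within rounding of the elapsed time.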
InfluxDB vs PostgreSQL. Fight!
• Aggregation:
SELECT avg(value), stddev(value) FROM measurements
WHERE type = 'PM25' AND time BETWEEN 'XXX' AND
'YYY';
• Count:
SELECT count(*) FROM measurements WHERE type =
'PM25' AND time BETWEEN 'XXX' AND 'YYY';
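On the InfluxDB side, queries like the two above are sent to the HTTP `/query` endpoint of InfluxDB 1.x with `db` and `q` parameters. The sketch below builds such a request URL for the aggregation query. It is an illustration under assumptions, not the code used in the benchmark: the host, the database name `airly`, and the time bounds are hypothetical, and the query is written in InfluxQL form (`MEAN` instead of `avg`, explicit `>=` / `<=` bounds instead of `BETWEEN`).

```python
from urllib.parse import urlencode

# Assumptions for illustration only: a local InfluxDB 1.x instance and a
# database called "airly". Neither is given in the talk.
BASE_URL = "http://localhost:8086"
DATABASE = "airly"


def aggregation_query(pollutant: str, start: str, end: str) -> str:
    """InfluxQL form of the aggregation query from the slide.

    Identifiers are double-quoted to be safe, since words such as
    MEASUREMENTS are reserved in InfluxQL.
    """
    return (
        'SELECT MEAN("value"), STDDEV("value") FROM "measurements" '
        f"WHERE \"type\" = '{pollutant}' "
        f"AND time >= '{start}' AND time <= '{end}'"
    )


def query_url(base_url: str, db: str, q: str) -> str:
    """Build a GET URL for the InfluxDB 1.x /query endpoint."""
    return f"{base_url}/query?{urlencode({'db': db, 'q': q})}"


if __name__ == "__main__":
    # Placeholder time range; the slide leaves the bounds as 'XXX' and 'YYY'.
    q = aggregation_query("PM25", "2018-01-01T00:00:00Z", "2018-01-31T23:59:59Z")
    print(query_url(BASE_URL, DATABASE, q))
```

Keeping the query string and the URL building separate makes it easy to swap in the count query without touching the HTTP part.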
InfluxDB vs PostgreSQL
Moar benchmark, plz…
https://www.influxdata.com/_resources/
Hosting on AWS
Hosting on AWS
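# Bring the system up to date, then reboot to pick up any kernel updates
sudo yum update
sudo reboot
# Register the InfluxData yum repository
cat <<EOF | sudo tee /etc/yum.repos.d/influxdb.repo
[influxdb]
name = InfluxDB Repository - RHEL
baseurl = https://repos.influxdata.com/rhel/7Server/x86_64/stable
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdb.key
EOF
# Install InfluxDB from that repository
sudo yum install influxdb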
Horizontal scalability?
High Availability
influxdb-relay
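The idea behind a relay can be illustrated with a toy fan-out: every write is sent to each InfluxDB backend, so each node ends up with its own full copy of the data. This is a simplified sketch of that idea, not influxdb-relay's actual code. The backend URLs and the injectable `send` function are hypothetical, and the real relay's response and retry rules are more involved.

```python
from typing import Callable, Dict, List

# Hypothetical backend addresses; a real setup would list its own nodes.
BACKENDS = ["http://influx-a:8086", "http://influx-b:8086"]

# A sender takes (backend_url, payload) and returns an HTTP status code.
Sender = Callable[[str, bytes], int]


def relay_write(payload: bytes, backends: List[str], send: Sender) -> Dict[str, bool]:
    """Send the same line-protocol payload to every backend.

    Returns a map of backend URL -> whether that write succeeded, so the
    caller can see which nodes missed the data.
    """
    results: Dict[str, bool] = {}
    for backend in backends:
        try:
            status = send(backend, payload)
            results[backend] = 200 <= status < 300
        except OSError:
            # A network error fails the write on that backend only.
            results[backend] = False
    return results


if __name__ == "__main__":
    # Fake sender standing in for a real HTTP POST to <backend>/write.
    def fake_send(backend: str, payload: bytes) -> int:
        return 204 if "influx-a" in backend else 503

    print(relay_write(b"measurements,type=PM25 value=12.5", BACKENDS, fake_send))
```

Passing the sender in as a function keeps the sketch runnable without a network and makes the failure cases easy to exercise.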
Demo time!
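Basic interaction with InfluxDB 1.x happens over HTTP: points are POSTed to the `/write` endpoint in line protocol, a text format of the shape `measurement,tag=value field=value timestamp`. The sketch below formats one point that way. The measurement, tag, and field names are hypothetical examples, and the escaping covers only the common cases (commas, spaces, equals signs, quotes).

```python
from typing import Dict, Optional, Union

FieldValue = Union[float, int, bool, str]

MEASUREMENT_SPECIALS = ", "   # escaped in measurement names
KEY_SPECIALS = ", ="          # escaped in tag keys, tag values, field keys


def _escape(text: str, specials: str) -> str:
    """Backslash-escape each of the given special characters."""
    for ch in specials:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value: FieldValue) -> str:
    # bool must be checked before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double-quoted


def to_line(
    measurement: str,
    tags: Dict[str, str],
    fields: Dict[str, FieldValue],
    timestamp_ns: Optional[int] = None,
) -> str:
    """Format a single point as one line of InfluxDB line protocol."""
    head = _escape(measurement, MEASUREMENT_SPECIALS)
    for key in sorted(tags):  # sorted tag keys give a stable output
        head += "," + _escape(key, KEY_SPECIALS) + "=" + _escape(tags[key], KEY_SPECIALS)
    body = ",".join(
        _escape(key, KEY_SPECIALS) + "=" + _format_field(value)
        for key, value in fields.items()
    )
    line = head + " " + body
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"  # nanoseconds since the Unix epoch
    return line


if __name__ == "__main__":
    print(to_line("measurements", {"type": "PM25"}, {"value": 12.5}))
```

The resulting line is what would go in the body of a POST to `/write?db=<database>`; several points can be sent in one request, one per line.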
Profit

Editor's Notes

  • #2 A Formula 1 car is packed with 200 sensors and produces 3 TB of data per race. It is predicted that autonomous cars will produce 4 TB of data per day. - Realize this: 90% of the data in the world today has been created in the last two years. - You and I, as engineers, need to be prepared to fight the sheer amount of data in its different forms. - Today, I’m going to introduce you to a database that will allow you to store terabytes of data per day on a single server.
  • #4 First, we’re going to talk about…
  • #5 Enter InfluxDB (2013): an open-source time-series database developed by InfluxData, written in Go. In 2018, InfluxData closed a $35 million Series C round of funding.
  • #6 Time-series data: the nuclear core temperature at Fukushima, your clicks that Google collects, smart fridge temperature measurements. Time-series databases are on the rise because of IoT, NoSQL and NewSQL, and ever-increasing data volumes. Features: LSM tree storage engine optimized for time series. SQL-like query language. HTTP query support. Why use it? …
  • #7 Why use it? SQL-like interface: easy to use, as we will see later. Quick to install and start using. Big community and wide adoption. …
  • #8 InfluxDB leads the DB-Engines popularity ranking of time-series databases, with a huge gap to second place.
  • #9 According to this rating MongoDB is the best Document Store. Right?
  • #10 5. It’s used by IBM, Cisco, eBay
  • #11 So why did we at Airly decide to use InfluxDB? Airly receives a high volume of data from air pollution sensors, and the amount was constantly growing. InfluxDB turned out to be extremely fast at ingesting data, thanks to TSM, its LSM-tree-based storage engine optimized for time-series data. That’s why I decided to migrate from Postgres to InfluxDB.
  • #16 And the data took 19 times less disk space.
  • #17 For some real benchmarks, you should check this resource. InfluxData is the company behind InfluxDB development.
  • #18 Let’s wrap up what we’ve learned so far: We saw that InfluxDB is… The reason for its existence is sensor data, logs, etc. We learned that it’s pretty popular, and that it’s fast for writes and queries and efficient in terms of disk space for sensor data, especially compared to PostgreSQL (which is a great product, love it). Let’s say you’ve decided to give InfluxDB a go. How would you use it? …
  • #19 Let’s say you want to host it on AWS. AWS has a TICK stack on its Marketplace.
  • #20 Or we do some manual installation.
  • #21 So far, the ride with InfluxDB has been good. There's a twist, though: the OSS version does not support replication or sharding. You can only scale vertically, which can get costly.
  • #22 - High availability is achievable using influxdb-relay - How does it work? - But that's one more layer and piece of infrastructure to manage.
  • #23 To wrap up this section: Easy to install and start using. Not easily scalable. Harder to make a production-ready system (monitoring? tuning? high availability).
  • #24 - Some basic interactions with influx. - !HTTP! - I’ll now show you how you can easily visualize and analyze the data hoarded in an InfluxDB store. Enter Grafana. Grafana is an open-source, feature-rich graph editor with an out-of-the-box connector to InfluxDB. Now, I have some time left for questions. ! Q&A !
  • #25 Let us reiterate what we’ve learned today: InfluxDB is a young time-series database. It’s designed for data such as real-time metrics, logs, sensor data, etc. It’s pretty darn fast for this kind of data, especially on the write side. It’s easy to install and start using. The free open-source version does not scale out out of the box. Its SQL-like interface makes for a quick start. And, for dessert, we saw how to easily plot graphs of InfluxDB data using Grafana. Now, a database is a tool. And I want you to choose your tools carefully, and check and test them thoroughly for your specific use case. Then build useful products based on them. Non-distributed is GOOD! As Einstein once said, “A person who never made a mistake never tried anything new”. Use InfluxDB if you have a valid use case. END…. ->