4. [Diagram: Before / After. Components shown: log files, scripts to parse data, a cron job for loading, a filtering script, a syslog script, a Tweet-fetching script, aggregation scripts, an rsync server]
✓ Parse/Format data
✓ Buffering & Retries
✓ Load balancing
✓ Failover
6. Middleware? : Fluentd
• Long running daemon process
• Compatibility for API, behavior and configuration files
• Multi platform / environment support
• Linux, Mac and Windows(!)
• Baremetal servers, Virtual machines, Containers
• Many use cases
• Various data, Various data formats, Unexpected errors
• Various traffic - small to huge
7. Middleware? : Batches (Minutes - Hours)
• Long running daemon process
• Compatibility for API, behavior and configuration files
• Multi platform / environment support
• Linux, Mac and Windows(!)
• Ruby, JRuby?, Rubinius?
• Baremetal servers, Virtual machines, Containers
• Many use cases
• Various data, Various data formats, Unexpected errors
• Various traffic - small to huge
8. Middleware? : Providing APIs and/or Client Libraries
9. Middleware? : Daily Development & Deployment, Providing Client Tools
10. Middleware? : Make Your Application Stable
11. Middleware? : Make Your Application Fast and Scalable
12. Case studies from
development of Fluentd
• Platform: Linux, Mac and Windows
• Resource: Memory usage and malloc
• Resource and Stability: Handling JSON
• Stability: Threads and exceptions
14. Linux and Mac:
Thread/process scheduling
• Both are UNIX-like systems...
• Mac (development), Linux (production)
• Test code must run on both!
• CI services provide multi-environment support
• Fluentd uses Travis CI :D
• Travis CI provides "os" option: "linux" & "osx"
• Important tests to be written: Threading
15.
require 'test/unit'
require 'socket'

class MyTest < ::Test::Unit::TestCase
  # Depends on thread scheduling: "line 2" may be appended before "line 1".
  test 'yay 1' do
    data = []
    thr = Thread.new do
      data << "line 1"
    end
    data << "line 2"
    assert_equal ["line 1", "line 2"], data
  end
end

class MyTest < ::Test::Unit::TestCase
  test 'client sends 2 data' do
    list = []
    thr = Thread.new do # Mock server
      TCPServer.open("127.0.0.1", 2048) do |server|
        while sock = server.accept
          list << sock.read.chomp
        end
      end
    end
    2.times do |i|
      TCPSocket.open("127.0.0.1", 2048) do |client|
        client.write "data #{i}"
      end
    end
    # Nothing waits for the server thread here, so the result is racy.
    assert_equal(["data 0", "data 1"], list)
  end
end
16. Loaded suite example
Started
F
===========================================================================================
Failure: test: client sends 2 data(MyTest)
example.rb:22:in `block in <class:MyTest>'
19: end
20: end
21:
=> 22: assert_equal(["data 0", "data 1"], list)
23: end
24: end
<["data 0", "data 1"]> expected but was
<["data 0"]>
diff:
["data 0", "data 1"]
===========================================================================================
Finished in 0.007253 seconds.
-------------------------------------------------------------------------------------------
1 tests, 1 assertions, 1 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
0% passed
-------------------------------------------------------------------------------------------
137.87 tests/s, 137.87 assertions/s
Mac OS X (10.11.16)
18.
class MyTest < ::Test::Unit::TestCase
  test 'client sends 2 data' do
    list = []
    thr = Thread.new do # Mock server
      TCPServer.open("127.0.0.1", 2048) do |server|
        listening = true
        while sock = server.accept
          list << sock.read.chomp
        end
      end
    end
    2.times do |i|
      TCPSocket.open("127.0.0.1", 2048) do |client|
        client.write "data #{i}"
      end
    end
    sleep 1 # blindly wait for the server thread to read both messages
    assert_equal(["data 0", "data 1"], list)
  end
end
19. Loaded suite example
Started
.
Finished in 1.002745 seconds.
--------------------------------------------------------------------------------------------
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
--------------------------------------------------------------------------------------------
1.00 tests/s, 1.00 assertions/s
Mac OS X (10.11.16)
20. Loaded suite example
Started
E
=================================================================================================
Error: test: client sends 2 data(MyTest): Errno::ECONNREFUSED: Connection refused - connect(2)
for "127.0.0.1" port 2048
example.rb:16:in `initialize'
example.rb:16:in `open'
example.rb:16:in `block (2 levels) in <class:MyTest>'
example.rb:15:in `times'
example.rb:15:in `block in <class:MyTest>'
=================================================================================================
Finished in 0.005918197 seconds.
-------------------------------------------------------------------------------------------------
1 tests, 0 assertions, 0 failures, 1 errors, 0 pendings, 0 omissions, 0 notifications
0% passed
-------------------------------------------------------------------------------------------------
168.97 tests/s, 0.00 assertions/s
Linux (Ubuntu 16.04)
24.
require 'test/unit'
require 'socket'
require 'timeout'

class MyTest < ::Test::Unit::TestCase
  test 'client sends 2 data' do
    list = []
    listening = false
    thr = Thread.new do # Mock server
      TCPServer.open("127.0.0.1", 2048) do |server|
        listening = true
        while sock = server.accept
          list << sock.read.chomp
        end
      end
    end
    sleep 0.1 until listening # wait until the server is listening
    2.times do |i|
      TCPSocket.open("127.0.0.1", 2048) do |client|
        client.write "data #{i}"
      end
    end
    # wait until both messages have arrived, but give up after 3 seconds
    Timeout.timeout(3){ sleep 0.1 until list.size >= 2 }
    assert_equal(["data 0", "data 1"], list)
  end
end
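As a possible variation on the polling above, the two waits ("is the server listening?" and "has all data arrived?") can be expressed with Ruby's thread-safe Queue instead of sleep loops. This is only a sketch: the names ready and received, and the use of port 0 (letting the OS pick a free port), are assumptions for illustration and are not from the slides.

```ruby
require 'socket'
require 'timeout'

ready    = Queue.new # server thread -> main thread: the port it listens on
received = Queue.new # thread-safe hand-off of received payloads

server_thread = Thread.new do
  # Port 0 asks the OS for any free port (an assumption of this sketch).
  TCPServer.open("127.0.0.1", 0) do |server|
    ready << server.addr[1] # addr[1] is the bound port number
    loop do
      sock = server.accept
      received << sock.read.chomp # read until the client closes
      sock.close
    end
  end
end

# Block until the server reports that it is listening (at most 3 seconds).
port = Timeout.timeout(3) { ready.pop }

2.times do |i|
  TCPSocket.open("127.0.0.1", port) do |client|
    client.write "data #{i}"
  end
end

# Block until exactly two payloads have arrived (at most 3 seconds).
list = Timeout.timeout(3) { Array.new(2) { received.pop } }
server_thread.kill
p list
```

Queue#pop blocks until a value is available, so the main thread does not need to guess how long to sleep on either platform.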
25. *NIX and Windows:
fork-exec and spawn
• Windows: yet another thread scheduling behavior :(
• Daemonizing:
• double fork (or Process.daemon) on *nix
• spawn on Windows
• Executing another process:
• fork & exec on *nix
• spawn on Windows
• CI on Windows: AppVeyor
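The platform split above can be sketched roughly as follows. The helper names windows? and run_child are hypothetical, and the host_os pattern is a common convention rather than anything from the slides; treat it as an illustration, not as Fluentd's implementation.

```ruby
require 'rbconfig'

# Hypothetical helper: guess the platform from RbConfig.
def windows?
  RbConfig::CONFIG['host_os'].match?(/mswin|mingw|cygwin/)
end

# Hypothetical helper: run a command in a child process and return its exit code.
def run_child(*cmd)
  pid = if windows?
          Process.spawn(*cmd)  # no fork on Windows
        else
          fork { exec(*cmd) }  # classic fork & exec on *nix
        end
  _, status = Process.wait2(pid)
  status.exitstatus
end

exit_code = run_child(RbConfig.ruby, "-e", "exit 3")
p exit_code
```

Process.spawn also works on *nix, so a code base that does not need anything between fork and exec could use it on every platform.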
28. Memory Usage:
Object leak
• Temporary values inevitably accumulate (leak) in a
long-running process
• 1,000 objects / hour
=> 8,760,000 objects / year
• Some solutions:
• In-process GC
• Storage with TTL
• (External storage: Redis, ...)
module MyDaemon
  class Process
    def initialize
      @map = {} # hour_key => { key => value }
    end
    def hour_key
      Time.now.to_i / 3600
    end
    def hourly_store
      @map[hour_key] ||= {}
    end
    def put(key, value)
      hourly_store[key] = value
    end
    def get(key)
      hourly_store[key]
    end
    # add # of data per hour
    def read_data(table_name, data)
      key = "records_of_#{table_name}"
      put(key, (get(key) || 0) + data.size) # the counter starts at 0 each hour
    end
  end
end
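The hourly store above keeps one bucket per hour, but nothing removes old buckets. A minimal sketch of the "storage with TTL" idea, assuming that buckets from past hours can simply be dropped, might look like this. The class name HourlyStore and the injectable clock are assumptions made so that the example can be exercised without waiting an hour.

```ruby
# Hypothetical TTL store: values live only until the end of the current hour.
class HourlyStore
  # clock is an assumption of this sketch; it returns the current epoch seconds.
  def initialize(clock: -> { Time.now.to_i })
    @clock = clock
    @map = {} # hour_key => { key => value }
  end

  def hour_key
    @clock.call / 3600
  end

  def put(key, value)
    bucket[key] = value
  end

  def get(key)
    bucket[key]
  end

  # Number of hour buckets currently held in memory.
  def size
    @map.size
  end

  private

  # Return the bucket for the current hour, dropping all older buckets first.
  def bucket
    current = hour_key
    @map.delete_if { |hour, _| hour < current }
    @map[current] ||= {}
  end
end

now = 0
store = HourlyStore.new(clock: -> { now })
store.put("records_of_users", 10)
now = 3600 # one hour later
p store.get("records_of_users")
p store.size
```

Eviction happens lazily on access, so memory stays bounded by one bucket as long as the store is used at least occasionally.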
31. Formatting Data Into JSON
• Fluentd handles JSON in many use cases
• both parsing and generating
• this consumes a lot of CPU time...
• JSON, Yajl and Oj
• JSON: ruby standard library
• Yajl (yajl-ruby): ruby binding of YAJL (SAX-based)
• Oj (oj): Optimized JSON
32.
require 'json'; require 'yajl'; require 'oj'
Oj.default_options = {bigdecimal_load: :float, mode: :compat, use_to_json: true}
module MyDaemon
  class Json
    def initialize(mode)
      klass = case mode
              when :json then JSON
              when :yajl then Yajl
              when :oj then Oj
              end
      @proc = klass.method(:dump)
    end
    def dump(data); @proc.call(data); end
  end
end
require 'benchmark'
N = 500_000
obj = {"message" => "a"*100, "100" => 100, "pi" => 3.14159, "true" => true}
Benchmark.bm{|x|
  x.report("json") {
    formatter = MyDaemon::Json.new(:json)
    N.times{ formatter.dump(obj) }
  }
  x.report("yajl") {
    formatter = MyDaemon::Json.new(:yajl)
    N.times{ formatter.dump(obj) }
  }
  x.report("oj") {
    formatter = MyDaemon::Json.new(:oj)
    N.times{ formatter.dump(obj) }
  }
}
33. $ ruby example2.rb
          user     system      total        real
json  3.870000   0.050000   3.920000 (  4.005429)
yajl  2.940000   0.030000   2.970000 (  2.998924)
oj    1.130000   0.020000   1.150000 (  1.152596)
# for 500_000 objects
Mac OS X (10.11.16)
Ruby 2.3.1
yajl-ruby 1.3.0
oj 2.18.0
34. Speed is not the only thing:
APIs for unstable I/O
• JSON and Oj have only ".load"
• it raises a parse error for:
• an incomplete JSON string
• additional bytes after a JSON string
• Yajl has a stream parser: very useful for servers
• a method to feed input data
• a callback for parsed objects
35.
require 'oj'
Oj.load('{"message":"this is ')                      # Oj::ParseError
Oj.load('{"message":"this is a pen."}')              # => Hash
Oj.load('{"message":"this is a pen."}{"messa"')      # Oj::ParseError
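The same behavior can be checked with the standard library's json, which needs no extra gem. This is a small sketch for comparison; the helper name try_parse is hypothetical.

```ruby
require 'json'

# Hypothetical helper: return the parsed object, or the error class on failure.
def try_parse(str)
  JSON.parse(str)
rescue JSON::ParserError => e
  e.class
end

incomplete = try_parse('{"message":"this is ')
complete   = try_parse('{"message":"this is a pen."}')
trailing   = try_parse('{"message":"this is a pen."}{"messa"')

p incomplete
p complete
p trailing
```

Only the input holding exactly one complete document parses; both the truncated input and the input with extra bytes raise, which is why a one-shot parser is awkward to use directly on a socket.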
37.
require 'yajl'
parsed_objs = []
parser = Yajl::Parser.new
parser.on_parse_complete = ->(obj){ parsed_objs << obj }
parser << '{"message":"aaaaaaaaaaaaaaa'
parser << 'aaaaaaaaa"}{"message"' # on_parse_complete is called
parser << ':"bbbbbbbbb"'
parser << '}' # on_parse_complete is called again
38.
require 'socket'
require 'oj'
TCPServer.open(port) do |server|
  while sock = server.accept
    begin
      buf = ""
      while input = sock.readpartial(1024)
        buf << input
        # can we feed this value to Oj.load ?
        begin
          obj = Oj.load(buf) # never succeeds if buf has 2 objects
          call_method(obj)
          buf = ""
        rescue Oj::ParseError
          # try with next input ...
        end
      end
    rescue EOFError
      sock.close rescue nil
    end
  end
end
39.
require 'socket'
require 'yajl'
TCPServer.open(port) do |server|
  while sock = server.accept
    begin
      parser = Yajl::Parser.new
      parser.on_parse_complete = ->(obj){ call_method(obj) }
      while input = sock.readpartial(1024)
        parser << input
      end
    rescue EOFError
      sock.close rescue nil
    end
  end
end
42. Thread in Ruby
• GVL(GIL): Giant VM Lock (Global Interpreter Lock)
• Only one of many threads can run Ruby code at a time
• Ruby VM can use only 1 CPU core
• A thread waiting on I/O is *not* running
• I/O threads can run in parallel
[Diagram: threads in I/O vs. running threads]
• We can write network servers in Ruby!
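A quick way to see that blocked threads do not hold the lock is to let several threads wait at the same time. The sketch below uses sleep as a stand-in for blocking I/O (an assumption for illustration; real socket reads behave similarly with respect to the GVL).

```ruby
require 'benchmark'

# Four threads each block for 0.2 seconds. If waiting threads held the GVL,
# this would take about 0.8 seconds; they overlap instead.
elapsed = Benchmark.realtime do
  threads = 4.times.map { Thread.new { sleep 0.2 } }
  threads.each(&:join)
end

puts format("4 threads x 0.2s of waiting took %.2fs", elapsed)
```

CPU-bound work would not overlap like this, which is why the model suits network servers better than number crunching.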
43.
class MyTestCase < ::Test::Unit::TestCase
  test 'sent data should be received' do
    received = []
    sent = []
    listening = false
    th1 = Thread.new do
      TCPServer.open("127.0.0.1", 2048) do |server|
        listening = true
        while sock = server.accepto # typo kept on purpose for this example
          received << sock.read
        end
      end
    end
    sleep 0.1 until listening
    ["foo", "bar"].each do |str|
      begin
        TCPSocket.open("127.0.0.1", 2048) do |client|
          client.write str
        end
        sent << str
      rescue => e
        # ignore
      end
    end
    assert_equal sent, received
  end
end
50.
class MyTestCase < ::Test::Unit::TestCase
  test 'sent data should be received' do
    received = []
    sent = []
    listening = false
    th1 = Thread.new do
      TCPServer.open("127.0.0.1", 2048) do |server|
        listening = true
        while sock = server.accepto # typo: NoMethodError kills this thread silently
          received << sock.read
        end
      end
    end
    sleep 0.1 until listening
    ["foo", "bar"].each do |str|
      begin
        TCPSocket.open("127.0.0.1", 2048) do |client|
          client.write str
        end
        sent << str
      rescue => e
        # ignore
      end
    end
    assert_equal sent, received # [] == []
  end
end
51. Thread in Ruby:
Methods for errors
• Threads die silently if an error is raised inside them
• abort_on_exception
• if true, an error raised in a thread is re-raised in the main thread
• required to avoid false successes
(silent crashes)
• report_on_exception
• if true, warns about errors raised in threads (a Ruby 2.4 feature)
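The silent death described above can be observed directly. In this sketch the error only becomes visible when the thread is inspected or joined; report_on_exception is switched off for the thread purely to keep the output quiet.

```ruby
t = Thread.new do
  Thread.current.report_on_exception = false # Ruby 2.4+; silence the warning
  raise "boom"
end

sleep 0.01 while t.alive? # the main thread keeps running; nothing is raised here

status = t.status # nil means the thread was terminated by an exception

# Thread#join re-raises the thread's error in the calling thread.
message = begin
  t.join
  nil
rescue => e
  e.message
end

p status
p message
```

Code that never joins its threads and never sets abort_on_exception has no other chance to notice such a failure.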
52.
class MyTestCase < ::Test::Unit::TestCase
  test 'sent data should be received' do
    received = []
    sent = []
    listening = false
    th1 = Thread.new do
      Thread.current.abort_on_exception = true
      TCPServer.open("127.0.0.1", 2048) do |server|
        listening = true
        while sock = server.accepto # typo: now surfaces as an error in the test
          received << sock.read
        end
      end
    end
    sleep 0.1 until listening
    ["foo", "bar"].each do |str|
      begin
        TCPSocket.open("127.0.0.1", 2048) do |client|
          client.write str
        end
        sent << str
      rescue => e
        # ignore
      end
    end
    assert_equal sent, received # normally not reached: the thread's error is re-raised first
  end
end
53. Loaded suite example7
Started
E
===========================================================================================
Error: test: sent data should be received(MyTestCase):
NoMethodError: undefined method `accepto' for #<TCPServer:(closed)>
Did you mean? accept
example7.rb:14:in `block (3 levels) in <class:MyTestCase>'
example7.rb:12:in `open'
example7.rb:12:in `block (2 levels) in <class:MyTestCase>'
===========================================================================================
Finished in 0.0046 seconds.
-------------------------------------------------------------------------------------------
1 tests, 0 assertions, 0 failures, 1 errors, 0 pendings, 0 omissions, 0 notifications
0% passed
-------------------------------------------------------------------------------------------
217.39 tests/s, 0.00 assertions/s
54. Thread in Ruby:
Process crash from errors in threads
• Middleware SHOULD NOT crash, as far as possible :)
• An error from a TCP connection MUST NOT crash the
whole process
• Many places can raise errors...
• Socket I/O, executing commands
• Parsing HTTP requests, parsing JSON (or other formats)
• The process
• should crash in tests, but
• should not crash in production
55. Thread in Ruby:
What your code needs for threads
• Set Thread#abort_on_exception = true
• for almost all threads...
• "rescue" all errors in threads
• to log these errors, and not to crash the whole process
• "raise" rescued errors again only in testing
• so that bugs make tests fail
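The three rules above could be combined in a small wrapper like the following. This is a sketch under assumptions, not Fluentd's code: the method name safe_thread and the testing: flag are hypothetical, and a real daemon would decide "testing" from its own configuration.

```ruby
require 'logger'
require 'stringio'

# Hypothetical wrapper: run the block in a thread, log any error, and
# re-raise it only when testing is true.
def safe_thread(logger, testing: false)
  Thread.new do
    Thread.current.abort_on_exception = true
    begin
      yield
    rescue => e
      logger.error("thread failed: #{e.class}: #{e.message}")
      raise if testing # in tests, let the bug fail the test run
    end
  end
end

log = StringIO.new
worker = safe_thread(Logger.new(log)) { raise "boom" }
worker.join # production mode: the error was logged, not propagated
puts log.string
```

In production mode the process survives and the log keeps the evidence; with testing: true the same error reaches the main thread through abort_on_exception.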
58. Writing Middleware:
• Taking care of:
• various platforms and environments
• resource usage and stability
• Requires knowledge of:
• Ruby's features
• Ruby VM's behavior
• Library implementations
• A different viewpoint from writing applications!
59. Write your code,
like middleware :D
Make it efficient & stable!
Thank you!
@tagomoris
60. Loaded suite example
Started
F
===========================================================================================
Failure: test: client sends 2 data(MyTest)
example.rb:22:in `block in <class:MyTest>'
19: end
20: end
21:
=> 22: assert_equal(["data 0", "data 1"], list)
23: end
24: end
<["data 0", "data 1"]> expected but was
<["data 0", "data 1"]>
diff:
["data 0", "data 1"]
===========================================================================================
Finished in 0.009425 seconds.
-------------------------------------------------------------------------------------------
1 tests, 1 assertions, 1 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
0% passed
-------------------------------------------------------------------------------------------
106.10 tests/s, 106.10 assertions/s
Mac OS X (10.11.16)
61. Memory Usage:
Memory fragmentation
• High memory usage, low # of objects
• memory fragmentation?
• glibc malloc: weak at fine-grained memory allocation
and multi-threading
• Switching to jemalloc via LD_PRELOAD
• FreeBSD's standard malloc (available on Linux)
• fluentd's rpm/deb packages use jemalloc by default
62. abort_on_exception in detail
• It doesn't actually abort the whole process
• it just re-raises the error in the main thread
sleeping = false
Thread.abort_on_exception = true
Thread.new{ sleep 0.1 until sleeping ; raise "yay" }
begin
  sleeping = true
  sleep 5
rescue => e
  p(here: "rescue in main thread", error: e)
end
p "foo!"
$ ruby example.rb
{:here=>"rescue in main thread", :error=>#<RuntimeError: yay>}
"foo!"