Jaffle: managing processes and log
messages of multiple applications
in development environment
Masaki Yatsu
PyCon JP
Sep 18, 2018
Agenda
• Jaffle: introducing a new development tool
• Motivation
• Launching Python apps and external processes
/ Automatic job execution
• Log management in the development environment
• Managing Jaffle configurations
• Architecture of Jaffle
https://jaffle.readthedocs.io/
https://github.com/yatsu/jaffle
What is Jaffle?
• Starts and stops multiple Python apps and external processes
• Auto task execution on detecting filesystem updates (e.g. auto-testing)
• Integrated log output with filtering and replacing patterns
• A framework to integrate Python apps in a Jupyter kernel session
  (described in the last section)
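As the smallest possible taste, Jaffle can manage a single external process. This is a hypothetical sketch using the process block that appears in a later slide; the name "frontend" and the command are placeholders:

process "frontend" {
  command = "yarn start"
  tty = true
}

$ jaffle start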
Background
• There are some tools and services to manage processes and logs in a production environment, but there is no easy solution for a development environment
• In recent years:
  • Many applications and services must be integrated with each other
  • Each application and service outputs a large amount of data
Launching Python apps and
external processes
/ Automatic job execution
Ex.1: Web dev with Tornado and React
• Launches the backend Web API server
• Launches two external processes
  • "yarn start" (frontend dev server)
  • "jest" (JavaScript testing)
• On detecting a .py file update:
  • Restarts the Web API server
  • Executes pytest
• Displays integrated log messages
https://jaffle.readthedocs.io/en/latest/cookbook/tornado_spa.html
Backend development with Tornado (web framework)
Frontend development with React
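The full configuration is on the cookbook page above. The sketch below is a condensed, hypothetical version assembled from the patterns in the following examples; the app_class path and handler details are placeholders (a pytest app block as in Ex.2 completes the picture):

kernel "py_kernel" {}

app "tornado_app" {
  class = "jaffle.app.tornado.TornadoBridgeApp"
  kernel = "py_kernel"
  start = "tornado_app.start()"
  options {
    app_class = "tornado_spa.webapp.ExampleApp"   # placeholder backend app
  }
}

app "watchdog" {
  class = "jaffle.app.watchdog.WatchdogApp"
  kernel = "py_kernel"
  options {
    handlers = [{
      patterns = ["*.py"]
      # restart the server and run pytest on .py updates
      code_blocks = [
        "tornado_app.handle_watchdog_event({event})",
        "pytest.handle_watchdog_event({event})",
      ]
    }]
  }
}

process "frontend" {
  command = "yarn start"   # frontend dev server
  tty = true
}

process "jest" {
  command = "jest"   # JavaScript testing
  tty = true
}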
Ex.2: Auto-pytest
• Executes test_*.py when it is updated
• Executes tests/foo/test_bar.py when the related implementation foo/bar.py is updated
• Because the pytest process is always standing by and reloads only the Python modules under the current directory, tests run fast
• The cache-clear strategy is configurable
https://jaffle.readthedocs.io/en/latest/cookbook/pytest.html
Ex.2: Auto-pytest / Config file

kernel "py_kernel" {}   # defines the Jupyter kernel (Python interpreter process)

app "watchdog" {   # app that watches filesystem updates
  class = "jaffle.app.watchdog.WatchdogApp"
  kernel = "py_kernel"
  options {
    handlers = [{
      watch_path = "pytest_example"
      patterns = ["*.py"]
      ignore_directories = true
      code_blocks = ["pytest.handle_watchdog_event({event})"]   # refers to the "pytest" variable defined in the kernel
    }]
  }
}

app "pytest" {
  class = "jaffle.app.pytest.PyTestRunnerApp"
  kernel = "py_kernel"
  options {
    args = ["-s", "-v", "--color=yes"]
    auto_test = ["pytest_example/tests/test_*.py"]   # executes the matched test file when it is updated
    auto_test_map {
      # when a file matching the left-hand side is updated,
      # the right-hand-side test file is executed
      "pytest_example/**/*.py" = "pytest_example/tests/{}/test_{}.py"
    }
  }
}

Write jaffle.hcl as above and execute:

$ jaffle start jaffle.hcl

The "jaffle.hcl" argument is optional:

$ jaffle start
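To make the auto_test_map mechanics concrete, here is a small Python sketch (not Jaffle's actual code) of how a glob pattern with ** and * can map an updated implementation file to its test file, filling the {} placeholders with the captured path segments:

import re

def map_to_test(path, pattern, template):
    # Translate the glob into a regex: '**' captures any directory
    # prefix, '*' captures a single path component.
    regex = re.escape(pattern).replace(r'\*\*', '(.*?)').replace(r'\*', '([^/]*)')
    match = re.fullmatch(regex, path)
    if match is None:
        return None
    # Fill each '{}' in the template with the next captured group.
    groups = iter(match.groups())
    return re.sub(r'\{\}', lambda _: next(groups), template)

print(map_to_test("pytest_example/foo/bar.py",
                  "pytest_example/**/*.py",
                  "pytest_example/tests/{}/test_{}.py"))
# -> pytest_example/tests/foo/test_bar.py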
Ex.3: Sphinx auto-build / Config file

kernel "py_kernel" {
  pass_env = ["PATH"]   # virtualenv requires this to pass environment variables to the kernel session
}

app "watchdog" {
  class = "jaffle.app.watchdog.WatchdogApp"
  kernel = "py_kernel"
  options {
    handlers = [{
      patterns = ["*/docs/*.*"]
      ignore_patterns = ["*/_build/*"]
      ignore_directories = true
      jobs = ["sphinx", "refresh"]   # refers to the job names defined below
    }]
  }
}

job "sphinx" {   # defines a job for command execution
  command = "sphinx-build -M html docs docs/_build"
}

job "refresh" {
  command = "osascript browser_refresh.scpt"
}

sphinx-build is executed automatically when .rst files under the docs/ directory are updated.

browser_refresh.scpt (for macOS):

tell application "Google Chrome" to tell the active tab of its first window
  reload
end tell
Ex.4: Jupyter Extension Development
https://jaffle.readthedocs.io/en/latest/cookbook/jupyter_ext.html

Development of both the backend (server extension) and the frontend (nbextension):
• Restarts the Jupyter Notebook server when a .py file is updated
• Executes "jupyter nbextension install" when a .js file is updated

kernel "py_kernel" {
  pass_env = ["PATH"]
}

app "watchdog" {
  class = "jaffle.app.watchdog.WatchdogApp"
  kernel = "py_kernel"
  options {
    handlers = [
      {
        patterns = ["*.py"]
        ignore_directories = true
        clear_cache = ["jupyter_myext"]
        code_blocks = ["notebook.handle_watchdog_event({event})"]
      },
      {
        patterns = ["*.js"]
        ignore_directories = true
        jobs = ["nbext_install"]
      },
    ]
  }
}

app "notebook" {
  class = "jaffle.app.tornado.TornadoBridgeApp"
  kernel = "py_kernel"
  options {
    app_class = "notebook.notebookapp.NotebookApp"
  }
  start = "notebook.start()"
}

job "nbext_install" {
  command = "jupyter nbextension install jupyter_myext --user --overwrite"
}

The Jupyter Notebook server can run in the Tornado IOLoop that Jaffle initialized.
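The following is a minimal Python sketch of that idea (not TornadoBridgeApp's actual code): the app attaches its HTTP server to an IOLoop it does not own and never blocks, so the host process can stop and restart it without being killed:

from tornado import httpserver, web

class HelloHandler(web.RequestHandler):
    def get(self):
        self.write("hello")

def start_app(port=8888):
    # Attach the server to the already-running IOLoop owned by the
    # host process; we never call IOLoop.start() here, so the host
    # can later stop() this server and start a fresh one on reload.
    app = web.Application([(r"/", HelloHandler)])
    server = httpserver.HTTPServer(app)
    server.listen(port)
    return server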
Log management in the
development environment
Problems of log management
• Hard to understand the processing flow across multiple log files
  → Log integration
• Hard to find a message because there are so many log messages
  → Filter out unnecessary log messages
• Hard to extract information from a log message because some messages are large
  → Extract only the necessary information and emphasize it with colors
Log Integration
• All messages are displayed in a single flow
• Each logger (app or process name) has its own unique color
Log Filtering

Filters out unnecessary messages from the pytest log:

app "pytest" {
  class = "jaffle.app.pytest.PyTestRunnerApp"
  kernel = "py_kernel"
  logger {
    suppress_regex = [
      "^platform ",
      "^cachedir:",
      "^rootdir:",
      "^plugins:",
      "collecting ...",
      "^collected ",
    ]
  }
}
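Conceptually, suppress_regex behaves like a logging filter that drops any record matching one of the patterns. A minimal Python sketch of the idea (not Jaffle's implementation):

import logging
import re

class SuppressFilter(logging.Filter):
    """Drop records whose message matches any of the given regexes."""

    def __init__(self, patterns):
        super().__init__()
        self.patterns = [re.compile(p) for p in patterns]

    def filter(self, record):
        message = record.getMessage()
        return not any(p.search(message) for p in self.patterns)

handler = logging.StreamHandler()
handler.addFilter(SuppressFilter([r"^platform ", r"^cachedir:", r"^rootdir:"]))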
Problem of large data included in a message

$ kubectl get services kubernetes -o json
{
  "apiVersion": "v1",
  "kind": "Service",
  "metadata": {
    "creationTimestamp": "2018-09-06T13:09:41Z",
    "labels": {
      "component": "apiserver",
      "provider": "kubernetes"
    },
    "name": "kubernetes",
    "namespace": "default",
    "resourceVersion": "18",
    "selfLink": "/api/v1/namespaces/default/services/kubernetes",
    "uid": "1dc6394b-b1d6-11e8-859a-080027be08a0"
  },
  "spec": {
    "clusterIP": "10.96.0.1",
    "ports": [
      {
        "name": "https",
        "port": 443,
        "protocol": "TCP",
        "targetPort": 8443
      }
    ],
    "sessionAffinity": "ClientIP",
    "sessionAffinityConfig": {
      "clientIP": {
        "timeoutSeconds": 10800
      }
    },
    "type": "ClusterIP"
  },
  "status": {
    "loadBalancer": {}
  }
}

It becomes very large in a log message:

service: {"apiVersion":"v1","kind":"Service","metadata":{"creationTimestamp":"2018-09-06T13:09:41Z","labels":{"component":"apiserver","provider":"kubernetes"},"name":"kubernetes","namespace":"default","resourceVersion":"18","selfLink":"/api/v1/namespaces/default/services/kubernetes","uid":"1dc6394b-b1d6-11e8-859a-080027be08a0"},"spec":{"clusterIP":"10.96.0.1","ports":[{"name":"https","port":443,"protocol":"TCP","targetPort":8443}],"sessionAffinity":"ClientIP","sessionAffinityConfig":{"clientIP":{"timeoutSeconds":10800}},"type":"ClusterIP"},"status":{"loadBalancer":{}}}

One developer needs only "labels":

{"labels":{"component":"apiserver","provider":"kubernetes"}}

But another needs "name" and "clusterIP":

service: {"name":"kubernetes","clusterIP":"10.96.0.1"}

The necessary information depends on the context. It is hard to define multiple loggers, and it is hard to set the log level for each logger when launching an app.
Replacement and Extraction [1/4]

app "my_app" {
  logger {
    replace_regex = [
      {
        from = "^service: (.*)$"
        # extracts ".spec.clusterIP" (the IP) from the captured dict;
        # jqf() works like the jq command for JSON
        to = "service ip: ${jqf('.spec.clusterIP', '\1')}"
      },
    ]
  }
}
Replacement and Extraction [2/4]

app "my_app" {
  logger {
    replace_regex = [
      {
        from = "^service: (.*)$"
        # colorize the extracted string
        to = "ip: ${fg('blue')}${jqf('.spec.clusterIP', '\1')}${reset()}"
      },
    ]
  }
}

Available functions:
• jq_all(): queries the dict and returns the list of results
  • alias: jq()
• jq_first(): queries the dict and returns the first item
  • alias: jqf()
• fg(): sets the foreground color
• bg(): sets the background color
• reset(): resets the color
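To illustrate how such a replacement rule works, here is a rough Python sketch of the assumed behavior (not Jaffle's code): match the "from" regex, parse the captured JSON, and query it with a tiny jq-like helper standing in for jqf():

import json
import re

def jq_first(query, data):
    # Tiny stand-in for jqf(): supports only dotted paths such as
    # '.spec.clusterIP'; a real implementation would use a jq library.
    for key in query.lstrip('.').split('.'):
        data = data[key]
    return data

line = 'service: {"spec": {"clusterIP": "10.96.0.1"}}'
match = re.search(r"^service: (.*)$", line)
if match:
    service = json.loads(match.group(1))
    print("service ip: %s" % jq_first(".spec.clusterIP", service))
# -> service ip: 10.96.0.1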
Replacement and Extraction [3/4]

A simple configuration for debugging: displays log messages that carry the label "XXX" in red.

app "my_app" {
  logger {
    replace_regex = [
      {
        from = "^XXX (.*)$"
        to = "${fg('red')}XXX \1${reset()}"
      },
    ]
  }
}
Replacement and Extraction [4/4]

Ideas to manage a large object in a message:
• Outputting large objects should be allowed in the debug log
• Use a tool to view the log with filters and replacements
• Filters can be configured for each context
• Jaffle supports merging multiple configurations
  (described later)
Organizing Jaffle configurations
Setting Variables at runtime
• You can define variables in jaffle.hcl
• The variables can be set by environment variables

Definitions:

variable "tornado_log_level" {
  default = "debug"
}

variable "disable_frontend" {
  default = false
}

kernel "py_kernel" {}

app "tornado_app" {
  class = "jaffle.app.tornado.TornadoBridgeApp"
  kernel = "py_kernel"
  start = "tornado_app.start()"
  logger {
    level = "${var.tornado_log_level}"   # reference
  }
  options {
    app_class = "tornado_spa_advanced.app.ExampleApp"
  }
}

process "frontend" {
  command = "yarn start"
  tty = true
  disabled = "${var.disable_frontend}"   # reference
}

Execution:

$ J_VAR_tornado_log_level=info \
  J_VAR_disable_frontend=true \
  jaffle start

You don't need to rewrite jaffle.hcl every time.
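A sketch of the mechanism in Python (assumed behavior, not Jaffle's code): any environment variable named J_VAR_<name> overrides the default declared in jaffle.hcl:

import os

def runtime_vars(defaults, environ=None, prefix="J_VAR_"):
    # Start from the defaults declared in jaffle.hcl, then let
    # J_VAR_-prefixed environment variables override them.
    environ = os.environ if environ is None else environ
    values = dict(defaults)
    for key, value in environ.items():
        if key.startswith(prefix):
            values[key[len(prefix):]] = value
    return values

print(runtime_vars({"tornado_log_level": "debug", "disable_frontend": False},
                   {"J_VAR_tornado_log_level": "info"}))
# -> {'tornado_log_level': 'info', 'disable_frontend': False}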
Merging Configurations

Example directory structure:

project_repo/
├─ jaffle.hcl       # base configuration (managed by git)
├─ frontend.hcl     # config for frontend dev (managed by git)
├─ backend.hcl      # config for backend dev (managed by git)
├─ my-jaffle.hcl    # personal config (ignored by .gitignore)
└─ src/

Run Jaffle for backend development:

$ jaffle start jaffle.hcl backend.hcl my-jaffle.hcl

This merges the files into one configuration. It is especially useful for applying multiple filtering configurations.
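The merge can be pictured as a recursive dictionary merge in which later files win. A sketch of the idea (Jaffle's exact merge semantics may differ):

def merge(base, override):
    # Recursively merge two config trees; scalar values and lists
    # from the later (override) file replace the earlier ones.
    result = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result

base = {"app": {"pytest": {"logger": {"level": "info"}}}}
personal = {"app": {"pytest": {"logger": {"level": "debug"}}}}
print(merge(base, personal))
# -> {'app': {'pytest': {'logger': {'level': 'debug'}}}}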
Architecture and
Implementation of Jaffle
Architecture [1/2]

[Diagram: JaffleStartCommand creates a kernel session through JaffleSessionManager, JaffleKernelManager, and JaffleKernelClient. Python code and its responses travel between Jaffle and the App in the Jupyter kernel session via ZeroMQ; log messages come back over a dedicated ZeroMQ socket for logging.]

Using Jupyter packages as a framework for Python execution and communication between apps.
Architecture [2/2]
• Runs multiple Python apps in a Jupyter kernel session
• Uses KernelManager, ContentsManager, and SessionManager to manage kernel sessions
• Uses jupyter-client to communicate with kernels
• Uses ZeroMQ to collect log messages
• Includes the following "Apps":
  • WatchdogApp: executes code blocks and jobs on detecting filesystem updates
  • PyTestRunnerApp: pytest runner
  • TornadoBridgeApp: starts/stops a Tornado app
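The kernel communication underneath can be pictured with plain jupyter-client, which Jaffle builds on. A minimal sketch that starts a kernel, executes code in it, and reads the reply over ZeroMQ:

from jupyter_client import KernelManager

km = KernelManager(kernel_name="python3")
km.start_kernel()
client = km.client()
client.start_channels()
# Jaffle instantiates and drives apps by sending Python code like this:
client.execute("watchdog = 1 + 2")
reply = client.get_shell_msg(timeout=5)
print(reply["content"]["status"])  # -> 'ok'
client.stop_channels()
km.shutdown_kernel()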
Inter-App Communication
• WatchdogApp and PyTestRunnerApp are assigned to the variables "watchdog" and "pytest" respectively
• Apps can communicate with each other via those variables
• Python code can be written in jaffle.hcl ("code_blocks" in the example below)

jaffle.hcl:

kernel "py_kernel" {}

app "watchdog" {   # "watchdog" becomes a variable in the kernel
  class = "jaffle.app.watchdog.WatchdogApp"
  kernel = "py_kernel"
  options {
    handlers = [{
      # …
      code_blocks = ["pytest.handle_watchdog_event({event})"]   # refers to the "pytest" variable
    }]
  }
}

app "pytest" {
  class = "jaffle.app.pytest.PyTestRunnerApp"
  kernel = "py_kernel"
  options {
    # …
  }
}

There is no special protocol between apps: you only need to know the arguments of the method you call.
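The receiving side is just an ordinary method on the app object. A hypothetical Python sketch, assuming the event exposes src_path and event_type the way watchdog filesystem events do (the actual shape is defined by WatchdogApp):

class PyTestRunnerSketch:
    # Any method on an app variable can be called from a code block;
    # the only contract between apps is the method's arguments.
    def handle_watchdog_event(self, event):
        if event.event_type in ("created", "modified"):
            self.run_tests(event.src_path)  # hypothetical helper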
Defining App

Usage (jaffle.hcl):

app "my_app" {
  class = "my_module.MyApp"
  kernel = "py_kernel"
  options {
    foo = 1
  }
  start = "my_app.bar()"
}

my_module.py:

from jaffle.app.base import BaseJaffleApp, capture_method_output
from tornado import gen, ioloop

class MyApp(BaseJaffleApp):

    def __init__(self, app_conf_data):
        super().__init__(app_conf_data)
        self.foo = self.options.get('foo')  # getting an option
        self.log.info('foo: %d', self.foo)

    @capture_method_output  # captures stdout/stderr
    def bar(self):
        print('bar')
        # called asynchronously on the IOLoop
        ioloop.IOLoop.current().add_callback(self.async_example)

    @gen.coroutine
    def async_example(self):
        yield self.execute_code('{var} = 1 + 2', var='foo')
        yield self.execute_command('echo hello')
        yield self.execute_job('my_job')

It will be executed as:

my_app = my_module.MyApp(app_conf_data)
my_app.bar()
What's Next

Extending Log Management

It is difficult to write regular expressions by hand, so:
• Edit filtering rules interactively on the fly
• Preview the rules in real time
• Save the configurations created interactively into a file
Thank you!