Process Control Daemon for Embedded Linux Platforms
Speaker: Hai Shalom
rt-embedded.com/pcd
Agenda
Background review: What were the reasons that led to the development of PCD.
PCD project review: Features and high-level overview of the project.
Live demonstration.
Q & A.
Some questions
Does your product have a process controller?
Does your product automatically recover after a crash?
Do you think your product’s boot time is fast enough?
Are you using methods other than printf to debug a crashed application?
Are you familiar with all the processes which are running in your product and their dependencies?
What were your answers?
Most of you probably answered “No” to at least one question.
People who answered “Yes” to all questions are probably using PCD already!
Let’s review some facts about Embedded Linux based products…
Done by scripts (rcS, rc.*). These are great, but might be:
Not optimal for embedded / not deterministic:
Limited ways to synchronize dependent processes (delays).
Limited ways to verify the successful start of a process.
What is PCD?
PCD (Process Control Daemon) is an open source, lightweight, system-level process manager for Embedded Linux based products (consumer electronics, network devices, etc.).
The PCD provides a complementary service for any Embedded Linux driven product.
Designed and implemented by Hai Shalom during his employment at Texas Instruments, for the Next-Gen Puma5 Cable chipset.
Released to open source as part of his M.Sc. degree research.
PCD is a proven solution that already drives millions of devices in the world.
PCD Features in high-level
System startup: PCD starts up the system in an efficient, synchronized and deterministic manner.
Process management: A centralized entity that controls and monitors all processes, and provides an API to manage them.
System recovery: A configurable, per-process recovery action is taken in case of a crash.
Debug information: PCD provides a detailed crash log in case of a program error.
How does it work? What are the advantages of products with PCD?
PCD Scripts: Rule blocks
Rule blocks replace/extend traditional shell scripts.
Each rule defines a single process.
Rule inter-dependency is well defined.
[Diagram: a PCD script file containing Rule 1, Rule 2 and Rule 3, which define Process 1, Process 2 and Process 3.]
Very simple and readable syntax. Easy to extend and maintain. Each Rule block is based on the same template and contains the following details:
Event Driven System Startup
Once all rules are parsed, the PCD builds a dependency graph database.
PCD starts each rule at the “right” time.
PCD continuously monitors the system.
[Diagram: PCD starting a graph of rules, from the first rules through to the last.]
A Completion event of one rule could be the Start event of another rule.
Event Driven System Startup
Reduced startup time
Dependencies between processes are well defined.
Rules are started as soon as their start event arrives.
No need for non-deterministic delays between starting processes.
Rules without inter-dependency are started in parallel.
Improves user experience and product reputation (Fast product!)
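The event-driven startup described above can be illustrated with a minimal sketch. It is not PCD's implementation: the `startup_waves` function and the rule names are hypothetical, and the only idea taken from the slides is that a rule starts once the rules it depends on have completed, with independent rules starting together.

```python
def startup_waves(depends_on):
    """Group rules into waves; all rules in one wave may start in parallel.

    depends_on maps a rule name to the set of rules whose completion
    is that rule's start event (a hypothetical representation).
    """
    remaining = {rule: set(deps) for rule, deps in depends_on.items()}
    done = set()
    waves = []
    while remaining:
        # A rule is ready when everything it waits for has completed.
        ready = sorted(r for r, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("cyclic or unknown dependency among: %s"
                             % sorted(remaining))
        waves.append(ready)
        done.update(ready)
        for rule in ready:
            del remaining[rule]
    return waves


# Hypothetical rules, for illustration only.
rules = {
    "syslog": set(),
    "network": set(),
    "webserver": {"network", "syslog"},
    "ui": {"webserver"},
}
waves = startup_waves(rules)
```

Here `syslog` and `network` have no dependencies and form the first wave, so no fixed delay is needed before `webserver` starts.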
Enhanced stability and robustness
[Diagram: a process crash is signaled to the PCD, which applies the action defined by the rule: Restart, Recover Rule, Ignore or Reboot.]
Enhanced stability and robustness
Enhanced monitoring of processes and recovery in case of failure.
Each Rule defines what to do in case its process crashes:
Restart the process: Usually for non-critical services such as a web server, or processes that can recover by restarting themselves.
Reboot the system: In case of a fatal, non-recoverable error.
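A per-rule recovery action can be sketched as a small dispatch table. This is an assumption-laden illustration, not the PCD code: `FailureAction`, `on_crash` and the callback parameters are hypothetical names, and only the two actions listed above are modeled.

```python
from enum import Enum


class FailureAction(Enum):
    """Hypothetical names for the recovery actions listed above."""
    RESTART = "restart"
    REBOOT = "reboot"


def on_crash(rule_name, actions, restart, reboot):
    """Apply the recovery action configured for the crashed rule.

    actions maps rule names to a FailureAction; restart and reboot are
    callbacks supplied by the caller (stand-ins for the real operations).
    """
    action = actions[rule_name]
    if action is FailureAction.RESTART:
        restart(rule_name)
    elif action is FailureAction.REBOOT:
        reboot("rule %s crashed" % rule_name)
    return action


# Usage with recording callbacks instead of real process control.
log = []
actions = {"webserver": FailureAction.RESTART, "init": FailureAction.REBOOT}
on_crash("webserver", actions, lambda r: log.append(("restart", r)),
         lambda why: log.append(("reboot", why)))
on_crash("init", actions, lambda r: log.append(("restart", r)),
         lambda why: log.append(("reboot", why)))
```

Keeping the action in the rule rather than in the process means the policy can change without rebuilding the application.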
Enhanced debugging capabilities
[Diagram: a crash signal in a process is caught through the PCD API, which prepares and sends the exception info to the PCD; the PCD produces a detailed crash log and stores the log in NVRAM.]
Enhanced debugging capabilities
The PCD exception handlers will catch and handle any fault exception (Signals).
The PCD will provide useful debug information.
The information speeds up the error fixing cycle and improves product robustness.
Error logs are saved in non-volatile memory:
Can be used for offline analysis after a validation cycle in the lab.
Can be used for post-mortem analysis of units from the field.
Registers pc and lr/ra can be used to trace the bug using addr2line or objdump.
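As a rough sketch of that last step, the register values from a crash log can be turned into an `addr2line` invocation. The function name, the register dictionary and the file name below are hypothetical; `-e`, `-f` and `-C` are standard GNU `addr2line` options. The sketch assumes a non-relocated executable with debug symbols; for shared libraries or position-independent executables the load address would have to be subtracted first.

```python
def addr2line_command(binary, registers, tool="addr2line"):
    """Build an addr2line argument list from crash-log register values.

    binary is the unstripped executable; registers maps register names
    to integer values. tool can be a cross-toolchain addr2line binary.
    """
    # pc is the faulting instruction; lr (ARM) or ra (MIPS) is the caller.
    addrs = [hex(registers[name]) for name in ("pc", "lr", "ra")
             if name in registers]
    if not addrs:
        raise ValueError("no pc/lr/ra value found in the register dump")
    # -e: executable, -f: print function names, -C: demangle C++ names.
    return [tool, "-e", binary, "-f", "-C"] + addrs


cmd = addr2line_command("app.debug", {"pc": 0x40A1C4, "lr": 0x40A100})
```

The resulting list can be passed to `subprocess.run` on the host where the cross toolchain is installed.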
Crash log with PCD
Process management
[Diagram: the PCD holds Rules 1 to 4 for Processes 1 to 4. After a new configuration, a process requests a restart of Process 2 and the PCD restarts it. After user input ("Disable something"), a process requests termination of Process 4 and the PCD terminates it.]
Process management
The PCD API is available by linking with the PCD library.
Dependency graph generation
In what order are the processes started?
What does each process depend on?
PCD can generate dependency graphs for visual representation of all the rules and their dependencies.
Visibility provides an excellent means to examine and understand the dependencies between the rules in the system, and to fix them in case of mistakes.
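To make the idea of a generated dependency graph concrete, here is a minimal sketch that writes rule dependencies as Graphviz DOT text. The output format is an assumption chosen for illustration and may differ from what PCD's own tooling produces; `to_dot` and the rule names are hypothetical.

```python
def to_dot(depends_on):
    """Render rule dependencies as Graphviz DOT text.

    depends_on maps a rule name to the set of rules it depends on.
    An edge "a" -> "b" means rule b starts after rule a.
    """
    lines = ["digraph pcd_rules {"]
    for rule in sorted(depends_on):
        lines.append('    "%s";' % rule)
        for dep in sorted(depends_on[rule]):
            lines.append('    "%s" -> "%s";' % (dep, rule))
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot({"network": set(), "webserver": {"network"}})
```

Feeding the text to a tool such as Graphviz `dot` would draw one node per rule and one arrow per dependency, which makes a missing or wrong dependency easy to spot.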
PCD is architecture agnostic, except for the crash log code that displays register details. To date, the following platforms are supported:
Licensing
The PCD Project is an Open-Source project.
The PCD project is licensed under the GNU Lesser General Public License version 2.1, as published by the Free Software Foundation.
Its license allows linking proprietary software without any license contamination.
To view a copy of this license, visit http://www.gnu.org/licenses/lgpl-2.1.html#SEC1
PCD contribution to product success
PCD improved the Puma5 products in various aspects:
Startup time: The system boots much more quickly compared to scripts (15 seconds faster).
Robustness, availability: Due to the recovery actions, the system is more available and user experience is better.
Quality: Detailed crash logs pointed out bugs, reduced fix time, enabled remote and offline analysis.
Added new rule blocks with their own modifications.
PCD Resources
PCD Home page (Hai’s Real-Time Embedded blog): http://www.rt-embedded.com/pcd
Project management and source code at SourceForge: http://sourceforge.net/projects/pcd/
PCD Documentation and user guides (Yes! There is some): http://www.rt-embedded.com/blog/pcd-process-control-daemon/pcd-documentation/
PCD support forum: http://sourceforge.net/projects/pcd/support
New software engineers are welcome to join the project and contribute.
PCD Features in high-level
System startup: PCD starts up the system in an efficient, synchronized and deterministic manner.
Process management: A centralized entity that controls and monitors all processes, and provides an API to manage them.
System recovery: A configurable, per-process recovery action is taken in case of a crash.
Debug information: PCD provides a detailed crash log in case of a program error.
PCD can make your product a better product!
Standard API for PCD services
The PCD API provides an easy interface to request various services from the PCD:
Start or terminate a process.
Send a “process ready” event.
Signal a process.
Register for the PCD default exception handlers.
Reboot the system (with a logged reason).
The PCD API is available by linking with the PCD library.
PCD Exception handler
Every program can register for PCD’s exception handlers.
The PCD acts as a “crash daemon” which listens on a dedicated socket.
The exception handler collects debug information and sends it to the PCD using only “Safe functions”.
The PCD formats the data, displays it on the console and logs it in the non-volatile storage.
PCD Software modules
The PCD design features various loosely coupled software modules:
Main: Performs the initializations and the main loop.
Rule Parser: Reads and parses the textual rules.
Rules DB: Stores all the rules as binary records.
Process: Starts, stops and monitors the processes.
Timer: Provides the ticks for the PCD.
Condition check: Checks if a condition is satisfied.
Failure action: Performs failure/recovery actions.
Exception: Implements the detailed exception handlers.
API: The PCD API interface (as a separate library).
PCD Software modules block diagram
[Block diagram: the Parser parses the textual configuration file with rules and adds each rule to the Rules DB (OK/Fail). Main activates the rules and checks messages that applications send through the PCD API over IPC (activate / stop). The Timer tick drives iteration over the enqueued rules. Condition Check reports OK / NOK. Process Control spawns, signals and monitors processes and reports stopped / signaled / exited. The Exception Handler reports crashed processes. Failure Action is activated on failure.]
PCD Rules Script
A textual file, similar to shell script syntax.
Contains a list of “Rule Blocks”.
A Rule block is defined per process.
Scripts can be extended by including other scripts, which allows a dedicated script for each logical or functional sub-system in the system.
Rules and Processes block diagram
[Block diagram: the Parser reads each rule from the PCD script and adds it to the Rules DB. A rule may depend on other rules. Each rule is associated with one process, which Process Control starts, stops and monitors.]
Syntax Checking
The PCD provides a parser that offers an easy way to verify that your PCD scripts do not contain syntax errors, similar to a compilation process.
The parser makes it possible to fix the configuration files on the host, without running them on the target and rebuilding an image in case of an error.
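The kind of host-side check described above can be sketched as follows. This is not the PCD parser and does not reproduce the real PCD script grammar: the `KEY = value` line format, the `#` comment marker, and the `RULE` and `COMMAND` keys are assumptions made only for this illustration.

```python
# Hypothetical set of keys every rule block must define.
REQUIRED_KEYS = {"RULE", "COMMAND"}


def check_script(text):
    """Return a list of error strings for a rules script (empty if clean)."""
    errors = []
    blocks = []
    current = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # assumed comment syntax
        if not line:
            continue
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key or not value:
            errors.append("line %d: expected KEY = value" % lineno)
        elif key == "RULE":
            # A RULE line opens a new block in this assumed format.
            current = {"RULE": value, "_line": lineno}
            blocks.append(current)
        elif current is None:
            errors.append("line %d: %s outside of a rule block" % (lineno, key))
        elif key in current:
            errors.append("line %d: duplicate %s" % (lineno, key))
        else:
            current[key] = value
    for block in blocks:
        for key in sorted(REQUIRED_KEYS.difference(block)):
            errors.append("line %d: rule %s is missing %s"
                          % (block["_line"], block["RULE"], key))
    return errors


good = "RULE = NET\nCOMMAND = /sbin/netd\n"
bad = "COMMAND = /bin/x\nRULE = A\nbogus line\n"
good_errors = check_script(good)
bad_errors = check_script(bad)
```

Running such a check as part of the host build catches a malformed script before it is packed into a target image.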
PCD - Open Source Benefits
No purchase costs or royalty fees.
Source code is fully available.
High quality code due to extensive exposure.
LGPL allows linking proprietary code with PCD.
Continuous development and bug fixes.
Need a new feature?
Either request it in the project tracker system,
or join the PCD community and develop it, so others can also benefit from your work.
PCD - Wish list (Future Features)
Support more platforms.
Watchdog/Keep alive mechanism.
Kernel monitoring agent/module.
Rule enhancements:
Affinity
Resource limitation (CPU, Heap, Stack, Fork Bombs..)
Current working directory
Others…