Developed for the Denver Art Museum by Ashley Blewer, this slide-deck covers some of the basics of diagnosing issues with Archivematica. Ashley covers everything from the software components involved with Archivematica to monitoring logs, system monitoring, and upgrading your system. The presentation concludes with some useful links for tech-savvy preservationists and for systems administrators unfamiliar with Archivematica!
Developed for DANS-KNAW, this presentation covers some of the fundamentals of the automation-tools, the helper scripts for automating transfers in Archivematica. Designed to complement the API slide-deck, the two resources can be consumed in either order. Knowing the API will help you understand the automation-tools, but knowing the automation-tools may help you understand what you want to create using the API.
API slide-deck here: https://www.slideshare.net/Archivematica/introduction-to-the-archivematica-api-september-2018-122548752
Developed for the University of Denver, this presentation covers some of the most fundamental, and most important, functions available in the Archivematica API. From discovering transfer locations to initiating and approving a transfer, a large part of what is required to automate your transfer workflows can be discovered herein.
There is now a complementary automation-tools slide-deck. The two resources can be consumed in either order. Knowing the API will help you understand the automation-tools, but knowing the automation-tools may help you understand what you want to create using the API.
Automation-tools slide-deck here: https://www.slideshare.net/Archivematica/automation-tools-making-things-go-march-2019
Presentation given by Tim Walsh at Archivematica Camp Baltimore 2018 about his and the Canadian Centre for Architecture's experience with the Archivematica Automation Tools.
Virtual Flink Forward 2020: Build your next-generation stream platform based ... (Flink Forward)
As organizations get better at capturing streaming data, and data velocity and volume keep increasing, traditional messaging queues and log storage systems suffer from scalability, operational, and maintenance problems. Apache Pulsar is a multi-tenant, high-performance distributed pub-sub messaging system. Pulsar's features include native support for multiple clusters in a single Pulsar instance, seamless geo-replication of messages across clusters, very low publish and end-to-end latency, seamless scalability to over a million topics, and guaranteed message delivery with persistent message storage provided by Apache BookKeeper. In this talk, I will use one of the most popular stream processing engines, Apache Flink, as an example to share our experience building a stream processing and storage stack, including:
* How to ensure end-to-end exactly-once semantics based on Pulsar's durable, replayable storage and Pulsar transactions.
* How to implement Pulsar topics as infinite tables based on Pulsar's schema.
* How to efficiently store stream state in Pulsar based on Pulsar's layered storage API.
* A usage scenario chaining all of these functionalities together in the streaming platform.
Flink Forward Berlin 2017: Patrick Lucas - Flink in Containerland (Flink Forward)
Apache Flink, a powerful distributed stateful stream processing framework, is an especially good fit for deployment on a containerization platform: its storage requirement is primarily external (e.g. HDFS or S3), clusters often share the lifetime of the jobs that run on them, and the flexibility of allocating resources on such a platform allows for scaling jobs up and down as necessary. In this talk I will give a brief introduction to Apache Flink, then describe the journey to making it a first-class citizen of the container world. I will cover my experience preparing to publish the “official repository” of Flink images on Docker Hub, the challenges of fitting a Flink deployment in a Kubernetes-shaped box, and the rough edges of Flink itself that were exposed by this process.
Flink Forward Berlin 2017: Dominik Bruhn - Deploying Flink Jobs as Docker Con... (Flink Forward)
This talk focuses on how to package, distribute, and deploy Flink jobs by leveraging existing Docker technology. Previously, deploying Flink jobs was a manual process that led to errors. In this talk, we present an approach that works well in a CI/CD environment by automating most steps: from the code of a Flink job in a repository to a running job on a YARN cluster.
Flink Connector Development Tips & Tricks (Eron Wright)
A look at some of the challenges and techniques for developing a connector for Apache Flink, covering the different types of connectors, lifecycle, metrics, event-time support, and fault tolerance.
Presentation video: https://www.youtube.com/watch?v=ZkbYO5S4z18
Modern software development is increasingly taking a “microservice” approach that has resulted in an explosion of complexity at the network level. We have more applications running distributed across different datacenters. Distributed tracing, events, and metrics are essential for observing and understanding modern microservice architectures.
This talk is a deep dive on how to monitor your distributed system. You will get tools, methodologies, and experiences that will help you understand what your applications expose and how to get value out of all this information.
Gianluca Arbezzano, SRE at InfluxData, will share how to monitor a distributed system and how to switch from a traditional monitoring approach to observability: focus on a server's role rather than its hostname, because names no longer matter much; servers and containers are fast-moving parts, and in case of trouble it is easier to detach a misbehaving one than to treat each server like a cute pet with a name. He will also cover how to design SLOs for your core services and how to iterate on them, and how to instrument your services with tracing tools like Zipkin or Jaeger to measure latency across your network.
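As a tiny illustration of the SLO design mentioned above, an availability target can be translated into an error budget. The function below is a hypothetical sketch for illustration, not something from the talk itself.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of allowed downtime in a rolling window for a given
    availability SLO (e.g. 0.999 means 99.9% availability)."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - slo_target)

# A 99.9% SLO leaves roughly 43.2 minutes of downtime per 30 days.
print(round(error_budget_minutes(0.999), 1))
```

Iterating on an SLO then becomes a concrete exercise: if you routinely burn the budget, either invest in reliability or relax the target.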
Flink Forward Berlin 2017: Andreas Kunft - Efficiently executing R Dataframes...Flink Forward
While dataflow engines offer scalability, their programming abstractions are often unfamiliar to data scientists, who are used to Python and R. To provide a more convenient interface, dataflow engines like Spark provide an R-like dataframe abstraction. While operations without user-defined code can be executed efficiently, the execution of UDFs is dominated by serialized data exchange between the dataflow engine and an external R process that evaluates the code. We present a new approach to executing user-defined functions using the Truffle/Graal compiler infrastructure, which enables efficient execution of dynamic languages on the JVM. Based on fastR, the R implementation provided by this infrastructure, we demonstrate the execution of R scripts directly inside the data pipelines of Flink, without data serialization and inter-process communication. Furthermore, we discuss future opportunities and problems, and compare our approach to native Flink, Spark, and SparkR.
Keystone Data Pipeline manages several thousand Flink pipelines, with variable workloads. These pipelines are simple routers which consume from Kafka and write to one of three sinks. In order to alleviate our operational overhead, we’ve implemented autoscaling for our routers. Autoscaling has reduced our resource usage by 25% - 45% (varying by region and time), and has reduced our on call burden. This talk will take an in depth look at the mathematics, algorithms, and infrastructure details for implementing autoscaling of simple pipelines at scale. It will also discuss future work for autoscaling complex pipelines.
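The rate-based scaling idea described above can be sketched in a few lines. This is a hypothetical heuristic for illustration only, not Netflix's actual Keystone algorithm; the function name and numbers are invented.

```python
import math

def target_parallelism(observed_rate: float, per_task_capacity: float,
                       headroom: float = 0.8) -> int:
    """Pick a task count so each task runs at `headroom` fraction of its
    measured per-task capacity, leaving slack for traffic spikes."""
    required = observed_rate / (per_task_capacity * headroom)
    return max(1, math.ceil(required))

# 150k msg/s observed, each task handles ~10k msg/s at full tilt:
print(target_parallelism(150_000, 10_000))  # 19 tasks
```

A production system would add damping (e.g. scale only when the recommendation is stable over a window) to avoid thrashing on bursty workloads.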
GrafanaCon 2015 - http://grafanacon.org/
Tobias will give an overview of Prometheus, an open-source monitoring system with a multi-dimensional label system, an expressive query language, and a dashboard editor called PromDash. Learn about the highlights and differences of PromDash compared to Grafana, and discuss the options for making Grafana the primary dashboard editor of the Prometheus project.
This is a talk on how you can monitor your microservices architecture using Prometheus and Grafana. It includes easy-to-execute steps to get a local monitoring stack running on your machine using Docker.
Python is popular amongst data scientists and engineers for data processing tasks. The big data ecosystem has traditionally been rather JVM-centric, and often Java (or Scala) is the only viable option for implementing data processing pipelines. That sometimes poses an adoption barrier for organizations that have already invested in other language ecosystems. The Apache Beam project provides a unified programming model for data processing, and its ongoing portability effort aims to enable multiple language SDKs (currently Java, Python, and Go) on a common set of runners. The combination of Python streaming on the Apache Flink runner is one example. Let’s take a look at how the Flink runner translates the Beam model into the native DataStream (or DataSet) API, how the runner is changing to support portable pipelines, how Python user code execution is coordinated with gRPC-based services, and how a sample pipeline runs on Flink.
In this video you are going to learn what an operator is in Apache Airflow. There are multiple kinds of operators, such as action operators, sensor operators, and transfer operators, and it is important to know why and when to use one over another.
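The operator taxonomy described above can be summarized in a plain-Python sketch. These classes only mimic the idea for illustration; they are not the real airflow.* API, and the class and method names here are invented.

```python
# Plain-Python sketch of Airflow's three operator kinds (NOT the real airflow API).
class BaseOperator:
    def __init__(self, task_id: str):
        self.task_id = task_id

class ActionOperator(BaseOperator):
    """Does something: run a Bash command, call a Python function, etc."""
    def execute(self) -> str:
        return f"{self.task_id}: performed an action"

class SensorOperator(BaseOperator):
    """Waits, by polling, until a condition holds (a file lands, a row appears)."""
    def poke(self) -> bool:
        return True  # in real life: check the external condition here

class TransferOperator(BaseOperator):
    """Moves data from a source system to a destination system."""
    def execute(self) -> str:
        return f"{self.task_id}: moved data between systems"

print(ActionOperator("run_etl").execute())
print(SensorOperator("wait_for_file").poke())
```

The rule of thumb implied by the video: use an action operator to do work, a sensor to wait for something, and a transfer operator to move data between systems.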
If you want access to the entire course and to support my work, go to
https://www.udemy.com/the-complete-hands-on-course-to-master-apache-airflow/?couponCode=YOUTUBE-AIRFLOW
Thank you very much and have a good learning day :)
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p... (Flink Forward)
Many stream processing applications can benefit from, or need to rely on, predictions made with machine learning (ML) methods. In this presentation, new features of Apache SAMOA are presented with a real data processing scenario. These features make Apache SAMOA fully accessible to Apache Flink users: (1) the data stream processed within Apache Flink is forwarded to the Apache SAMOA stream mining engine to perform predictions with stream-oriented ML models; (2) the ML models evolve after every labelled instance and, at the same time, new predictions are sent back to Apache Flink. In both cases, Apache Kafka is used for data exchange. Hence, Apache SAMOA is used as the stream mining engine, provided with input data from, and sending predictions back to, Apache Flink. During the presentation, real-life aspects are illustrated with code examples, such as input and prediction stream integration and monitoring the latency of data processing and stream mining.
Coprocessors - Uses, Abuses, Solutions - presented at HBaseCon East 2016 (Esther Kundin)
Presentation from HBaseCon East 2016
Coprocessors: Uses, Abuses, Solutions
This talk details common issues associated with coprocessor use and deployment, as well as some of the workarounds that our team at Bloomberg used.
The final section was presented by Clay Baenziger.
Group of Airflow core committers talking about what's coming with Airflow 2.0!
Speakers: Ash Berlin-Taylor, Kaxil Naik, Kamil Breguła, Jarek Potiuk, Daniel Imberman, and Tomasz Urbaszek.
Linux Server Deep Dives (DrupalCon Amsterdam) (Amin Astaneh)
Over the past few years the Linux kernel has gained features that allow us to learn more about what's really happening on our servers and the applications that run on them.
This talk explores how these new features, particularly perf_events and eBPF, enable us to answer questions about what a Drupal site is doing in real time, beyond what the standard logs, server performance tools, and even strace will reveal. Attendees will be given a brief introduction to example uses of these tools to diagnose performance problems.
This talk is intended for attendees who are familiar with Linux and the command line, and who have used host observability tools in the past (top, netstat, etc.).
Linux Monitoring and Performance Tuning (Iman Darabi)
How do you monitor a Linux server? What metrics are important when monitoring a server? How do those metrics relate to monitoring tools? What are basic Linux server optimizations, and how do you apply them?
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...) (Red Hat Developers)
The fifth major release of Hibernate contains many internal changes developed in collaboration between the Hibernate team and the Red Hat middleware performance team. Efficient access to databases is crucial for scalable and responsive applications, and Hibernate 5 received much attention in this area. You'll benefit from many of these improvements by merely upgrading, but it's important to understand some of the new, performance-boosting features because you will need to enable them explicitly. We'll explain the development background of all these powerful new features and the investigation process behind the performance improvements. Our aim is to provide good guidance so you can make the most of them in your own applications. We'll also peek at other performance improvements made in JBoss EAP 7, such as in the caching layer, the connection manager, and the web tier. We want to make sure you can all enjoy better-performing applications, requiring less power and fewer servers, without compromising your developers' productivity.
You’re ready to make your applications more responsive, scalable, fast and secure. Then it’s time to get started with NGINX. In this webinar, you will learn how to install NGINX from a package or from source onto a Linux host. We’ll then look at some common operating system tunings you could make to ensure your NGINX install is ready for prime time.
View full webinar on demand at http://nginx.com/resources/webinars/installing-tuning-nginx/
21 people attended the July 2014 program meeting hosted by BDPA Cincinnati chapter. The topic was 'Open Source Tools and Resources'. The guest speaker was Greg Greenlee (Blacks In Technology).
'Open source' refers to a computer program in which the source code is available to the general public for use or modification from its original design. Open source code is typically created as a collaborative effort in which programmers improve upon the code and share the changes within the community. Open source sprouted in the technological community as a response to proprietary software owned by corporations. Over 85% of enterprises are using open source software. Managers are quickly realizing the benefit that community-based development can have on their businesses. This month, we put on our geek hats and detective gloves to learn how we can monitor our computers’ environments using open source tools. This meetup covered some of the most popular ‘Free and Open Source Software’ (FOSS) tools used to monitor various aspects of your computer environment.
Nagios Conference 2014 - Eric Mislivec - Getting Started With Nagios Core (Nagios)
Eric Mislivec's presentation on getting started with Nagios Core. The presentation was given during the Nagios World Conference North America held Oct 13th - Oct 16th, 2014 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/conference.
Prometheus - Intro, CNCF, TSDB, PromQL, Grafana (Sridhar Kumar N)
https://www.youtube.com/playlist?list=PLAiEy9H6ItrKC5PbH7KiELiSEIKv3tuov
-What is Prometheus?
-Differences between Nagios and Prometheus
-Architecture
-Alertmanager
-Time series DB
-PromQL (Prometheus Query Language)
-Live Demo
-Grafana
An operating system (OS) is a software program that manages the resources of a computer system and provides a platform for running applications. Its primary functions include resource management, process management, memory management, file system management, and user interface. There are many different types of operating systems, such as desktop operating systems like Windows and macOS, server operating systems like Linux and Windows Server, and embedded operating systems like those used in mobile phones and other small devices. The choice of operating system depends on the type of device, the intended use, and other factors.
Nagios Conference 2011 - Daniel Wittenberg - Scaling Nagios At A Giant Insur... (Nagios)
Daniel Wittenberg's presentation on a reference story for a German health insurance company. The presentation was given during the Nagios World Conference North America held Sept 27-29th, 2011 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna
Nagios Conference 2014 - Andy Brist - Nagios XI Failover and HA Solutions (Nagios)
Andy Brist's presentation on High Availability and Failover Solutions for Nagios XI. The presentation was given during the Nagios World Conference North America held Oct 13th - Oct 16th, 2014 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/conference
These slides accompany a 1.5 hour webinar sponsored by the Western New York Library Resources Council, presented by Dan Gillean of Artefactual Systems on February 15th, 2017.
The session was intended to introduce participants to some of the key standards, services, and tools available to support digital preservation planning and activities. Part 1 focused on DP101, and how to begin tackling digital preservation in your institution. Part 2 introduced the Archivematica project's history, philosophy, and aims, while Part 3 was a live demonstration of Archivematica in action.
Thank you to WNYLRC for sponsoring this event!
Slides accompanying a presentation given by Dan Gillean of Artefactual Systems at the PERICLES/DPC joint conference and meeting, "Acting on Change: New Approaches and Future Practices in LTDP," held in London at the Wellcome Collection Conference Center, Nov 30 - Dec 2, 2016.
The talk examines the question of the Capacity Gap - why is it that we have so many tools, services, standards, models, and metrics to support digital preservation, but so many organizations feel they do not have the capacity or capability to begin tackling digital preservation within their institution?
The presentation offers a different take based on Dan's experience working as an analyst and consultant for a software development company engaging with many different types of organizations and individuals in the cultural heritage sector. While acknowledging that the under-resourced nature of cultural heritage work plays a key role, this presentation examines some oft-encountered perceptual or cognitive barriers to getting started with digital preservation. It then provides some suggestions on how to overcome these barriers, acknowledging that anything is better than nothing when it comes to DP, and that sometimes perfect can be the enemy of good.
Slides accompanying a presentation by Dan Gillean, delivered at the Glenstone Digital Preservation Roundtable in Potomac, Maryland, November 4th, 2016.
These slides introduce Archivematica's approach to supporting digital preservation workflows, and our development philosophy behind the application.
Slides accompanying a talk delivered by Dan Gillean at PASIG 2016, held at the Museum of Modern Art in New York, NY October 26-28, 2016.
These slides explore the roles that standards play in digital preservation, and introduce some of the key standards that Archivematica was designed with in mind, and which the system uses to help you capture technical, preservation, and administrative metadata when generating Archival Information Packages (AIPs) and Dissemination Information Packages (DIPs).
For more information about Archivematica, see: https://www.archivematica.org
Presentation to the PREMIS Implementation Fair at iPRES 2016, about how PREMIS in METS metadata is implemented in the Archivematica digital preservation system.
Slides accompanying a brief talk given as part of the Archivematica User Group meeting at #SAA2016, the Society of American Archivists 2016 conference in Atlanta, GA. The user group meeting was held on August 3rd in Room 309/310 of the Hilton Atlanta.
These slides offer Archivematica users a brief update on the features included in the current 1.5 release and what's on the roadmap for future releases, as well as discussion of related events and resources such as the first ArchivematiCamp in August, screencasts, and more.
Slides for a presentation made at the Archives Association of British Columbia's 2016 Annual Conference, April 15, 2016, held in Vancouver, BC, Canada.
The slides aim to provide users with a basic introduction to some of the key considerations when implementing a digital preservation plan, describing the workflow with a series of cooking-related references.
Presentation to Toronto Area Archivists' Group, September 11th 2015. See also slide notes: http://www.slideshare.net/Archivematica/getting-started-with-atom-and-archivematica-for-digital-preservation-and-access-notes
These slides are the basis of an Open Repositories 2015 talk about Archivematica integration.
Abstract: The open repository ecosystem consists of many interlocking systems which satisfy needs at different points in content management workflows, and these differ within and among institutions. Archivematica is a digital preservation system which aims to integrate with existing repository, storage and access systems in order to leverage the resources that institutions have invested towards building their repository over time. The presentation will cover every integration the Archivematica project has completed thus far, including DSpace and DuraCloud, LOCKSS, Islandora/Fedora, Archivists' Toolkit, Access to Memory (AtoM), CONTENTdm, Arkivum, HP TRIM, and OpenStack, as well as ongoing projects with ArchivesSpace, Dataverse, and BitCurator. Each of these projects has had its own set of limitations in scope because of the requirements of the project sponsor and/or the limitations of other systems, so in many ways several of them are not, and may never be, 'complete' integrations. The discussion will explore what that means and strategies for expanding the functional capabilities of integration work over time. It will address scoping integration workflows and building requirements with limitations on functionality and resources. We will examine how systems can be built and enhanced in ways that accommodate diverse workflows and varied interlocking endpoints.
Presentation slides from demonstration of hierarchical (or, arranged) DIPs from Archivematica to AtoM. Functionality to be available in Archivematica version 1.5 and AtoM version 2.2.
Report on two projects which used Archivematica installed in a Cloud hosted environment: Council of Prairie and Pacific University Libraries, and ArchivesDirect
5. Supporting technology
● Python: programming language
● Django: web application framework
● Gearman: job scheduler
● MySQL: relational database
● Elasticsearch: search index
● Nginx: web server (can be apache)
● Gunicorn: interface between Python and Nginx
● git: version control system
● Ansible/Docker: deployment/configuration management
6. All on Linux
● Ubuntu 16.04 or 18.04
● CentOS 7 or Red Hat
7. Format Policy Registry
● Tools we use to perform preservation actions
● Rules we use to determine when to use the Tools
● Commands are applied to files based on the Rules
10. Technical stack
● Lots of tools = lots of potential points of failure
● Archivematica strives to relay as much information as possible to the user -- especially about what the tools are doing and what they are producing
11. Components
● Dashboard: for the user
● MCPClient: does the work
● MCPServer: manages the work
● Storage Service: manages storage
12. Logging in
● Logging in (ssh)
● Moving files (scp)
● What’s running (ps -ef | grep py)
● How much space? (du)
● How much free space? (df -h)
● Load average? (top)
● Read end of logs (tail)
● Read logs (less)
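The commands above can be run together as a quick health check once you have logged in. This is an illustrative sketch, not an official script; the `/var` paths are typical defaults and may differ on your installation.

```shell
# Quick diagnostics on the Archivematica host (paths are examples).
ps -ef | grep '[p]y' || true          # Python processes (bracket trick hides grep itself)
du -h --max-depth=1 /var 2>/dev/null || true   # space used per subdirectory
df -h                                 # free space per filesystem
uptime                                # load averages for the last 1, 5, and 15 minutes
tail -n 50 /var/log/syslog 2>/dev/null || true # read the end of a log
```

Use `less` instead of `tail` when you want to page through a whole log interactively.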
14. Moving files
Download a file from the remote machine to your computer:
scp your_username@remotehost.url:your-file.txt /your/local/directory
Send a file from your computer to the remote machine:
scp path/to/your-file.txt your_username@remotehost.url:/some/remote/directory
15. What’s running?
ps -ef | grep py
These services should all be running:
● Dashboard (apache)
● Database (mysql)
● Elasticsearch (elastic)
● Storage Service (uwsgi or nginx)
● FITS
● Server (MCP): should show both MCP server and MCP client
16. What’s running?
ps -ef | grep py
Also, these dependent services should all be running:
● MySQL
● Elasticsearch
● Gearman
● Nginx
● Nailgun
● Clamav
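The checks on slides 15 and 16 can be sketched as a small loop. The process names below are typical defaults and assumptions on my part (e.g. MySQL usually appears as `mysqld`, and Elasticsearch runs inside a `java` process); adjust them to match your distribution.

```shell
# Sketch: report whether each expected supporting process is present.
check_services() {
  for proc in mysqld java gearmand nginx clamd; do
    if pgrep "$proc" >/dev/null 2>&1; then
      echo "$proc: running"
    else
      echo "$proc: NOT running"
    fi
  done
}
check_services
```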
17. du
See the amount of space used on the machine. To get the size of each subdirectory of the directory you are in, you can run:
du -h --max-depth=1
This command can take a long time if you have very large mounted drives.
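When hunting for what is eating disk space, the same `du` survey is easier to read sorted largest-first:

```shell
# Largest subdirectories of the current directory, biggest first.
du -h --max-depth=1 . 2>/dev/null | sort -rh | head -n 10
```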
18. Check free space on disk
df -h
● Up to 3x the size of a transfer may be needed as free space for processing
● A cron job can auto-clear deleted/rejected files
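A cron job like the following could implement that auto-clearing. The schedule, retention window, and shared-directory path are all assumptions for illustration (`/var/archivematica/sharedDirectory` is a common default); verify the path on your system before enabling anything that deletes files.

```shell
# Hypothetical crontab entry (edit with `crontab -e`):
# at 02:00 daily, delete rejected material older than 7 days.
0 2 * * * find /var/archivematica/sharedDirectory/rejected -mindepth 1 -mtime +7 -delete
```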
20. Restarting services
service archivematica-dashboard restart
service archivematica-mcp-client restart
service archivematica-mcp-server restart
service archivematica-storage-service restart
service gearmand restart
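The five restarts can be wrapped in one loop. This sketch only prints the commands (remove the `echo` to actually run them; that requires root, and service names can vary between installs):

```shell
# Dry run: show the restart command for each core service.
restart_all() {
  for svc in archivematica-dashboard archivematica-mcp-client \
             archivematica-mcp-server archivematica-storage-service gearmand; do
    echo sudo service "$svc" restart   # drop "echo" to execute
  done
}
restart_all
```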
21. Reading logs
less /var/log/archivematica/dashboard/dashboard.log
less /var/log/archivematica/dashboard/dashboard.debug.log
less /var/log/archivematica/MCPClient/MCPClient.log
less /var/log/archivematica/MCPClient/MCPClient.debug.log
less /var/log/archivematica/MCPServer/MCPServer.log
less /var/log/archivematica/MCPServer/MCPServer.debug.log
less /var/log/archivematica/storage-service/storage-service.log
less /var/log/archivematica/storage-service/storage-service.debug.log
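Rather than paging through each log in turn, a quick scan for errors across all of them can narrow down where to look first. The log locations match the list above; adjust the path if your install differs.

```shell
# Count lines mentioning errors or Python tracebacks across all Archivematica logs.
scan_logs() {
  grep -riE 'error|traceback' /var/log/archivematica/ 2>/dev/null | wc -l
}
echo "error lines found: $(scan_logs)"
```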
24. Upgrading
● Each new release requires a decision: whether to upgrade, and how much time to set aside for it.
● The tradeoff of not upgrading is falling out of step with the community and having a harder time getting support for an older version.
● It is a good idea to test the upgrade: make a backup of your production environment and test the upgrade there. If that is not possible, plan for downtime.
○ To make this possible, you may want to explore virtualizing your Archivematica environment so you can run a development (testing) environment alongside production.
25. Security upgrades
● Make sure that Ubuntu is set up to do Unattended Upgrades, which apply security patches (the equivalent of Windows updates).
● Sometimes these upgrades require a system restart; you may need to plan for 30 minutes of downtime. Do not restart in the middle of processing: make sure your current Transfers/AIPs are done first.
27. Getting Help
● Participating in the community forum
○ Archivematica
https://groups.google.com/forum/#!forum/archivematica
● Documentation
○ Main docs https://www.archivematica.org/en/
○ Wiki https://wiki.archivematica.org/Main_Page
● Github issues
○ Main repo https://github.com/archivematica/Issues/issues
28. See also
This presentation in document form
● For tech-savvy preservationists: https://docs.google.com/document/d/1GybyH7X_gpZ7wpYVo5d9__LeGNuXYCky0oairJGJAmo/edit#heading=h.y1nyq0vlcvsl
● For Archivematica-unfamiliar systems administrators: https://docs.google.com/document/d/1NDzGHBGuPFa7GTHCMEl3D2nvvdZRxG2FpdsGAYoG31I/edit#