IP Monitoring & Diagnostics With Command Line Tools: Part 8 - Caching The Results
Storing monitoring outcomes in temporary cache containers separates the observation and diagnostic processes so they can run independently of the centralised marshalling and reporting process.
Maintaining security and integrity is important. Taking measurements in satellite-nodes and transferring them to a central-cortex are two separate activities. Measuring techniques might require higher levels of privilege than is necessary for simply aggregating results. Decouple them and run each one separately with just the right amount of permissions. This reduces the attack surface for possible intrusions.
Why caching is a good idea
Decouple measurement and aggregation to improve security. The simplified logic also reduces the risk of things going wrong. After the first run, a default result is always available, resulting in fewer errors (although it may not be up to date if the monitor is stalled).
There are four basic techniques:
- Record single measurements in a file
- Append observations to a log
- Capture measurements in a rotating buffer and vote on the result
- Store measurements in a database table
The observations take place in the satellite-nodes where they are cached and the central-cortex independently retrieves the results from the satellite caches when it needs them.
Caching single results in a file
Use this technique for storing the live result of a measurement. It might be a count of processes, open files or whether a server process is running or stopped.
The single angle bracket (>) I/O redirection always overwrites older values with the latest result.
Create a cache container (my_cache) for your files in the /var/log folder.
Set appropriate file ownership and read access permissions as each file is created. Use the chown command to set the owner, the chgrp command to set the group and the chmod command to set the access permissions.
Beware that if you set the owner to be a different account and restrict the permissions, you may not be able to overwrite the file with a new measurement.
Include the monitoring and fetching accounts in the same group so they can share access. The chmod 640 value gives read-write access to the original file owner and allows other users in the same group to only read the file. Everyone else is denied access.
This example stores the measurement and sets up the file permissions:
echo "My test results" > /var/log/my_cache/result.dat
chmod 640 /var/log/my_cache/result.dat
The central-cortex would pull the file across like this:
scp {user}@{hostname}:/var/log/my_cache/result.dat {local_file_name}
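Because the central-cortex may fetch the file at any moment, it can help to write each new measurement under a temporary name and then rename it into place, so a reader never sees a half-written file. The following is a minimal sketch of that idea. The cache_result function name and the CACHE_DIR variable are assumptions made for illustration, and the directory defaults to a scratch location so the sketch can be tried without root access.

```shell
#!/bin/sh
# Hypothetical helper: store one measurement in a cache file.
# CACHE_DIR is an assumed variable; set it to /var/log/my_cache in production.
CACHE_DIR="${CACHE_DIR:-${TMPDIR:-/tmp}/my_cache}"

cache_result() {
    name="$1"     # cache file name, e.g. result.dat
    value="$2"    # measurement to store

    mkdir -p "${CACHE_DIR}" || return 1

    # Write to a temporary name first, then rename into place.
    # A rename within one directory is atomic, so readers see either
    # the old value or the new one, never a partial file.
    tmp="${CACHE_DIR}/${name}.$$"
    echo "${value}" > "${tmp}" || return 1
    chmod 640 "${tmp}"          # owner read-write, group read-only
    mv "${tmp}" "${CACHE_DIR}/${name}"
}

# Example: cache a count of the entries in /etc as a stand-in measurement.
cache_result file_count.dat "$(ls /etc | wc -l | tr -d ' ')"
cat "${CACHE_DIR}/file_count.dat"
```

Setting the permissions on the temporary file before the rename means the cached file is never visible with the wrong access mode.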
Logging the output as a list
Use the double angle bracket (>>) redirection to append each result and build up a time-based log of activity. Then analyse for trends after a historical data set is compiled.
Design a rigid convention for the format so that you can analyse the logs consistently later on. Each line should be constructed like this:
Item | Description |
---|---|
Date | ISO notation (YYYY-MM-DD) |
Time | Use 24-hour notation (HH:MM) |
Symbolic name | Identifies the measurement. Filter with this when different observations share the same log file. |
Status | Indicates a routine value or an exception of some kind: Info, Warning, Error or Fatal. |
Description | A textual description of the log entry. |
Choose a separator character that never appears inside any of the items, so each line can be split into fields reliably.
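The log line convention above could be wrapped in a small helper so that every script writes entries the same way. This is only a sketch: the log_entry function, the LOG_FILE variable and the choice of a vertical bar as the separator are assumptions, not part of the original design.

```shell
#!/bin/sh
# Hypothetical helper: append one formatted entry to a shared log.
# LOG_FILE and the "|" separator are assumptions for this sketch.
LOG_FILE="${LOG_FILE:-${TMPDIR:-/tmp}/my_monitor.log}"

log_entry() {
    symbol="$1"         # symbolic name of the measurement
    status="$2"         # Info, Warning, Error or Fatal
    description="$3"    # free text; must not contain the separator

    # date(1) builds the ISO date and the 24-hour time in one call.
    stamp=$(date '+%Y-%m-%d|%H:%M')

    echo "${stamp}|${symbol}|${status}|${description}" >> "${LOG_FILE}"
}

log_entry FILE_COUNT Info    "412 files open"
log_entry WEB_SERVER Warning "Server process restarted"

# Filter one measurement out of the shared log by its symbolic name (field 3).
awk -F'|' '$3 == "FILE_COUNT"' "${LOG_FILE}"
```

Keeping the symbolic name in a fixed field position is what makes the final awk filter reliable when several observations share one log.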
Unattended log files grow very large. Running a scheduled job to compress and archive them every day keeps your system neat and tidy.
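A daily archive job might look something like the sketch below. The LOG_FILE and ARCHIVE_DIR variables, the archive_log function and the script path in the crontab comment are all hypothetical, and gzip is assumed to be installed.

```shell
#!/bin/sh
# Sketch of a daily archive job. LOG_FILE and ARCHIVE_DIR are assumed names.
LOG_FILE="${LOG_FILE:-${TMPDIR:-/tmp}/my_monitor.log}"
ARCHIVE_DIR="${ARCHIVE_DIR:-${TMPDIR:-/tmp}/my_monitor_archive}"

archive_log() {
    # Nothing to do if the log is missing or empty.
    [ -s "${LOG_FILE}" ] || return 0

    mkdir -p "${ARCHIVE_DIR}" || return 1

    # Move the live log aside under a dated name, then compress it.
    # The next append with >> simply starts a fresh log file.
    target="${ARCHIVE_DIR}/$(basename "${LOG_FILE}").$(date '+%Y-%m-%d')"
    mv "${LOG_FILE}" "${target}" && gzip -f "${target}"
}

archive_log

# Example crontab entry (the script path is hypothetical), run at 00:05 daily:
#   5 0 * * * /usr/local/bin/archive_logs.sh
```

Note that running this sketch twice on the same day would overwrite that day's archive, so add the time to the file name if the job may run more often.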
There are other system logs that you might find useful. Many of these also live in the /var/log folder. Some services store their logs differently but they are not hard to find.
Use a rotating buffer
Intermittent failures trigger false warnings if they are observed as a single event. Record half a dozen readings at one-minute intervals and count how many failures are captured. Trigger the warning when the vote is unanimous.
Trim the input file with a tail command whenever a new result is recorded. Redirecting an input file back to itself will destroy it because the empty output file is created first. Avoid this by writing the trimmed data to a temporary file name and then renaming it over the original with mv, which replaces the file in a single atomic step.
# Append the latest reading (YES or NO) to the buffer
echo {yes_or_no} >> rotating_buffer.dat
# Keep only the six newest readings, using a temporary file name
tail -n 6 rotating_buffer.dat > rotating_buffer.dat_
mv rotating_buffer.dat_ rotating_buffer.dat
# Count the failures currently held in the buffer
COUNT=$(grep -c "NO" rotating_buffer.dat)
if [ "${COUNT}" -eq 6 ]
then
    echo "Six consecutive failures - Call for help"
fi
This was used in a high-availability server where a pager call was triggered only after six consecutive NO results. It prevented unwarranted call-outs for the engineers.
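The same buffer logic can be packaged as a function whose exit status carries the vote, which keeps the calling script short. This is a sketch under assumptions: record_and_vote, BUFFER_FILE and BUFFER_SIZE are illustrative names, and the buffer lives in a scratch directory by default.

```shell
#!/bin/sh
# Hypothetical wrapper around the rotating buffer technique.
# BUFFER_FILE and BUFFER_SIZE are assumed names for this sketch.
BUFFER_FILE="${BUFFER_FILE:-${TMPDIR:-/tmp}/rotating_buffer.dat}"
BUFFER_SIZE=6

record_and_vote() {
    result="$1"     # YES or NO from the latest check

    echo "${result}" >> "${BUFFER_FILE}"

    # Keep only the newest readings; write to a temporary name and
    # rename it so the buffer is never truncated while being read.
    tail -n "${BUFFER_SIZE}" "${BUFFER_FILE}" > "${BUFFER_FILE}_"
    mv "${BUFFER_FILE}_" "${BUFFER_FILE}"

    # grep -c counts matching lines; -x matches whole lines only.
    # "|| true" hides grep's non-zero exit status when the count is 0.
    failures=$(grep -cx "NO" "${BUFFER_FILE}" || true)

    # Succeed (exit status 0) only when every slot holds a failure.
    [ "${failures}" -eq "${BUFFER_SIZE}" ]
}

if record_and_vote "NO"
then
    echo "Six consecutive failures - Call for help"
fi
```

A single YES anywhere in the last six readings resets the vote, which is what suppresses call-outs for intermittent faults.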
Use a database instead of a log file
A database is useful for recording a history of measurements to analyse trends over a very long time. This is better than log files because it avoids log rotation.
The operations team can set up a database for you. It needs to have a minimally privileged user account that allows remote access to write new data. The table configuration is done with a more powerful account.
The results table should have these columns:
Column | Description |
---|---|
KEY | A primary key identifies individual measurements so they can be accessed or edited. |
SYMBOL | Use the symbolic name for filtering. |
TIME_STAMP | The timestamp for the measurement supports trend analysis. Use the ISO date format: YYYY-MM-DD HH:MM:SS. |
DATA_TYPE | Describe the measurement using one of a limited set of symbolic data types. |
UNITS | Describe the units of measure because all the measurements will be collated in the same table. |
VALUE | The specific value of the measurement is stored separately to facilitate arithmetic operations. |
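One possible definition of this table is sketched below as a shell script that writes a schema file for the operations team to load. The table name results, the column types, the index and the file name are assumptions chosen for illustration. KEY is a reserved word in MySQL, so it is quoted with backticks.

```shell
#!/bin/sh
# Write a schema file for the results table described above.
# The table name, column types, index and file name are assumptions.
SCHEMA_FILE="${SCHEMA_FILE:-${TMPDIR:-/tmp}/create_results_table.sql}"

# The quoted tag stops the shell from interpreting the backticks.
cat > "${SCHEMA_FILE}" <<'SQL_SCHEMA'
CREATE TABLE IF NOT EXISTS results (
    `KEY`      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    SYMBOL     VARCHAR(64) NOT NULL,
    TIME_STAMP DATETIME NOT NULL,
    DATA_TYPE  VARCHAR(16) NOT NULL,
    UNITS      VARCHAR(16) NOT NULL,
    `VALUE`    DOUBLE NOT NULL,
    INDEX symbol_time (SYMBOL, TIME_STAMP)
);
SQL_SCHEMA

# A privileged account would then load it, for example:
#   mysql my_example_db < "${SCHEMA_FILE}"
```

The index on SYMBOL and TIME_STAMP is a guess at the most common query, which filters by symbolic name and then walks through time for trend analysis.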
Use SQL from the command line
There are three useful techniques to understand when using a mysql command directly from inside a shell script to write data to the database:
• Direct execution of SQL queries from the command line
• Source running SQL queries from a separate file
• Running embedded SQL queries with input-redirection
Avoid storing account credentials in scripted commands, because they are visible in ps listings that can be viewed by other users.
Instead, create a file called .my.cnf in your home directory. Configure the database access credentials there without hard wiring them into the scripts. Note the leading dot on the custom config file name. Here is an example:
[client]
user = {db-user-name}
password = {password}
Note that this user name is an account within the database and not an operating system user account.
Where you would previously need to type a command like this:
mysql -u {db-user-name} -p{password}
Now, you only need to type the mysql command on its own without the account name and password.
Protect the file against intruders by setting the file access permissions with this command:
chmod 400 .my.cnf
Now it can only be read by the owning account.
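If the file is created from a script, a restrictive umask keeps it private from the moment it exists, rather than leaving it readable until chmod runs. The sketch below assumes a CNF_FILE variable that points at a scratch location so it can be tried safely. The real file is .my.cnf in your home directory and the placeholders must be replaced with real credentials.

```shell
#!/bin/sh
# Sketch: create the credentials file so it is never readable by others.
# CNF_FILE is an assumed variable; the real file is ~/.my.cnf.
CNF_FILE="${CNF_FILE:-${TMPDIR:-/tmp}/example.my.cnf}"

# Remove any earlier read-only copy so it can be written again.
rm -f "${CNF_FILE}"

# The umask only applies inside the subshell, so the rest of the
# script keeps its normal file creation mode.
(
    umask 077
    cat > "${CNF_FILE}" <<'CNF_CONTENT'
[client]
user = {db-user-name}
password = {password}
CNF_CONTENT
)

# Drop write access as well, leaving the file read-only for its owner.
chmod 400 "${CNF_FILE}"
ls -l "${CNF_FILE}"
```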
If the database is running on a different machine, include the host and port details provided by your operations team. We will omit those in subsequent examples for simplicity:
mysql -h {hostname} -P {port number}
Omit the target database name from the configuration file to avoid associating everything with one single database for all tables.
Directly executing SQL from a shell script
Execute queries directly by adding the --execute flag followed by some SQL instructions. The -e flag is a useful abbreviation. This example displays the server version:
mysql -e STATUS | grep "^Server version"
Source running SQL scripts
Encapsulate more complex queries into a separate SQL script file and redirect any messages into a log file:
mysql my_example_db < script.sql > output.log
Specify the target database name on the command line. Alternatively add a USE my_example_db instruction at the start of the SQL script. The query is now more robust because the target database is integral to the script.
USE my_example_db;
SHOW TABLES;
Redirecting embedded SQL to standard input
A here-document is an embedded stream of text that is redirected to the standard-input of a command. The redirection stops when the terminating tag is encountered.
#!/bin/sh
mysql my_database <<SQL_QUERY_SOURCE
SELECT COUNT(*)
FROM my_table_name
WHERE MEASUREMENT_SYMBOL='FILE_COUNT';
SQL_QUERY_SOURCE
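Note that the terminating tag must be spelled exactly like the opening tag and must sit on a line of its own with no leading or trailing spaces. A space between the redirection operator (<<) and the opening tag is optional.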
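A here-document can be explored without a database by sending it to cat instead of mysql. The sketch below uses cat as a stand-in and shows one detail worth knowing: with a plain tag the shell expands variables inside the text, while a quoted tag passes the text through untouched. The SYMBOL variable and the query are illustrative only.

```shell
#!/bin/sh
SYMBOL="FILE_COUNT"

# Plain tag: the shell expands $SYMBOL before the command sees the text.
expanded=$(cat <<SQL_QUERY_SOURCE
SELECT COUNT(*) FROM my_table_name WHERE SYMBOL='$SYMBOL';
SQL_QUERY_SOURCE
)

# Quoted tag: the text is passed through literally.
literal=$(cat <<'SQL_QUERY_SOURCE'
SELECT COUNT(*) FROM my_table_name WHERE SYMBOL='$SYMBOL';
SQL_QUERY_SOURCE
)

echo "${expanded}"
echo "${literal}"
```

The first line printed contains FILE_COUNT and the second still contains the text $SYMBOL. Single quotes inside a here-document are ordinary characters and do not suppress expansion.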
Passing arguments from shell scripts
Modify the query source code with parameters passed from your script when you describe the SQL. The shell substitutes the variable values before mysql reads the query. This works with the --execute flag or a here-document.
#!/bin/sh
PARAMETER="my_table_name"
mysql my_database << SQL_QUERY_SOURCE
SELECT COUNT(*) FROM $PARAMETER;
SQL_QUERY_SOURCE
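The same substitution can be used to write a measurement. Because the shell pastes parameter values into the SQL text unchanged, it is prudent to check them first. The sketch below prints the statement rather than running it, so it can be inspected before it is piped into mysql. The build_insert_sql function and the results table name are assumptions, and the column names follow the table layout suggested earlier.

```shell
#!/bin/sh
# Hypothetical helper: print an INSERT statement for one measurement.
# The results table name is an assumption for this sketch.
build_insert_sql() {
    symbol="$1"; data_type="$2"; units="$3"; value="$4"

    # Reject anything that is not a plain identifier, because the
    # shell pastes these values into the SQL text unchanged.
    for word in "${symbol}" "${data_type}" "${units}"; do
        case "${word}" in
            ''|*[!A-Za-z0-9_]*) echo "Bad parameter: ${word}" >&2; return 1 ;;
        esac
    done

    # Crude numeric check: digits, decimal point and minus sign only.
    case "${value}" in
        ''|*[!0-9.-]*) echo "Bad value: ${value}" >&2; return 1 ;;
    esac

    # Backticks are escaped so the shell leaves them for MySQL.
    cat <<SQL_QUERY_SOURCE
INSERT INTO results (SYMBOL, TIME_STAMP, DATA_TYPE, UNITS, \`VALUE\`)
VALUES ('${symbol}', NOW(), '${data_type}', '${units}', ${value});
SQL_QUERY_SOURCE
}

build_insert_sql FILE_COUNT INTEGER files 412

# To store the measurement, pipe the statement into the client:
#   build_insert_sql FILE_COUNT INTEGER files 412 | mysql my_example_db
```

NOW() records the database server's clock, which keeps timestamps consistent when several satellite-nodes write to the same table.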
Aggregating the results
The central-cortex can gather the results from the caches in the satellite-nodes. Or it can query the database for results that are recorded there. Any third-party systems can be accessed via HTTP and if necessary, their measurements can be written to the database or aggregated with the rest of the locally cached data in the cortex.
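The file-based part of this gathering step could be sketched as follows. The NODES list, the monitor account name, the COLLECT_DIR location and both function names are assumptions. The remote path matches the earlier cache example.

```shell
#!/bin/sh
# Sketch of the cortex side. NODES, COLLECT_DIR and the "monitor"
# account are assumed names; set NODES to a space-separated host list.
NODES="${NODES:-}"                       # e.g. "node-a node-b"
REMOTE_FILE="/var/log/my_cache/result.dat"
COLLECT_DIR="${COLLECT_DIR:-${TMPDIR:-/tmp}/collected}"

fetch_all() {
    mkdir -p "${COLLECT_DIR}" || return 1
    for node in ${NODES}; do
        # BatchMode stops scp prompting for a password in an unattended job.
        scp -q -o BatchMode=yes "monitor@${node}:${REMOTE_FILE}" \
            "${COLLECT_DIR}/${node}.dat" ||
            echo "WARNING: no result from ${node}" >&2
    done
}

report() {
    # Print one "node: value" line for every cached result.
    for file in "${COLLECT_DIR}"/*.dat; do
        [ -f "${file}" ] || continue
        printf '%s: %s\n' "$(basename "${file}" .dat)" "$(cat "${file}")"
    done
}

fetch_all
report
```

When a fetch fails, the previous copy stays in place, so the report still shows the last known value for that node, although it may be out of date.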
Conclusion
Caching makes our systems more robust and secure. The design becomes architecturally very simple for nodes that we own and build. The third-party machines should provide API access via HTTP as we discussed earlier.
The measurement data has a lifecycle like this:
• Measurements captured by privileged commands in satellite-nodes
• Cache long form results in data files
• Capture ongoing events into log files
• Capture state values to rotating buffers
• Write single measurements into a database
• Results gathered back to the central-cortex for aggregation
• Optional acknowledgement sent back to satellite-nodes in some way so they can garbage collect
• Access third party systems via HTTP from the central-cortex and if necessary, store those observations in a database
• Satellite-nodes clean up any temporary cached files that do not need to be retained