Red Hat Enterprise Linux Diagnostics and Troubleshooting Strategy

This article describes effective troubleshooting methods and data collection strategies.

Use a systematic approach to troubleshooting with the scientific method.
Collect system information to support troubleshooting.
Use Red Hat resources to support troubleshooting.

Table of Contents

Using the Scientific Method
Defining Troubleshooting as a Scientific Method
Using the Scientific Method
Collecting Information to Support Troubleshooting
Locating Troubleshooting Information
systemd-journald Journal Files
rsyslog Log Files
Private Log Files
Enabling Verbose Information
Troubleshooting the Audit Log for SELinux Events
Troubleshooting with Red Hat Resources
Red Hat Customer Portal
Collecting Information for Red Hat Support
Searching the Knowledgebase
Managing Support Cases
Red Hat Customer Portal Tools
Red Hat Customer Portal Labs
Red Hat Insights
Summary

Using the Scientific Method

After completing this section, you should be able to use a systematic approach to troubleshooting with the scientific method.

Defining Troubleshooting as a Scientific Method

Efficient and timely troubleshooting skills can be developed through practice of the widely recognized scientific method. The scientific method is an empirical process for using logic to hypothesize and test theories through observation, refined by experimentation and validation of deductions that are drawn from those hypotheses. With experience, scientific method users acquire knowledge and develop useful conclusions through refinement and elimination of tested hypotheses.

Many technical professionals solve problems by using past experience, having previously seen the same problem, or having been taught how to solve similar problems. Consequently, it is efficient to query colleagues and other knowledge sources about the scenario, after the problem is accurately defined. However, when a problem is difficult to define and is not recognized through experience, making unscientific guesses about the problem cause can waste significant time and effort.

The scientific method is a proven technique for resolving new and complex problems, but it might not be the quickest route to a resolution in every scenario. To solve technical problems, you must discover and gather information, discard what does not fit the observations, and draw logical conclusions to uncover root causes. The scientific method consists of these steps:

Collect relevant information.
Create an accurate problem statement.
Formulate testable hypotheses.
Test each hypothesis.
Record and analyze the test results.
Fix and verify the problem resolution.

Using the Scientific Method

In many scenarios, you might repeat the scientific method, either in whole or by iterating on certain steps, to discover and verify the root cause of the problem.

Step 1: Collect relevant information.

The first step, before theorizing a problem statement, is to collect reliable and factual information. Common reasons for failed troubleshooting include incomplete information or misunderstood problem observations. Start by asking questions of the person who reported the problem and other relevant users or support personnel. Focus on, and record in a readable form, the verifiable facts that are related to the problem. Avoid opinions and judgments, but allow for suggestions from others who have proven to be successful with similar troubleshooting.

Useful information can also be found in screen outputs, system and application log files, error messages, and diagnostic tools. Diagnostic or error messages can be entered in Internet search engines to locate reports or resolutions for similar problems.

When applications or systems are known to have previously worked properly, then logic dictates that something must have changed. Use file, application, or system validation or comparison tools to locate changed files or to compare an errant file or system to a known good file or system that is expected to be configured the same.
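
The comparison against a known good file can be sketched with Python's standard difflib module. This is only an illustration: the two configuration snippets are invented, and in practice the inputs would be the contents of the suspect file and of a known good copy.

```python
import difflib


def changed_lines(known_good, suspect):
    """Return only the removed (-) and added (+) lines between two texts."""
    diff = difflib.unified_diff(
        known_good.splitlines(),
        suspect.splitlines(),
        fromfile="known_good",
        tofile="suspect",
        lineterm="",
    )
    # Keep +/- lines, but skip the "--- known_good" / "+++ suspect" headers.
    return [
        line for line in diff
        if line.startswith(("+", "-"))
        and not line.startswith(("+++ ", "--- "))
    ]


# Hypothetical file contents, used only to demonstrate the function.
GOOD_CONF = "Listen 80\nServerName www.example.com\n"
SUSPECT_CONF = "Listen 8080\nServerName www.example.com\n"
```

Calling changed_lines(GOOD_CONF, SUSPECT_CONF) reports the single Listen line that differs, which narrows the search to one changed setting.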

When you have a clear perception of the problem, try to reproduce the error or failure. Use verbose logging or tool diagnostic modes to provide additional information about the errant process or observed behavior.

Step 2: Create an accurate problem statement.

The process of creating a problem statement results in a specific definition of the problem in words, preferably written. Creating the statement as a grammatically correct and understood sentence contributes to accurately clarifying the problem. If you are unable to state the problem in a clear sentence that others agree defines that specific problem, then your problem statement needs work. An accurate problem statement explains the problem to be solved, such that a successful problem resolution is the inverse of that statement.

A problem statement includes answers to factual queries about the problem:

What specific system, application, process, or function, is failing, degraded, or down?
What actions or steps can reproduce the problem?
When was the problem first noticed or reported?
Where does the problem occur or where is the behavior observed?
Who experiences the problem? Not who reported it, but what is the scope of its effect?

If the problem is reproducible, the problem statement should include the steps that cause the problem to occur. Here are examples of well-defined problem statements:

Problem 1: Beginning last Friday, all marketing department users are reporting that they are unable to successfully launch or use the mail application, which displays the error message “Data store XYZ is not available.” The problem can be consistently reproduced by selecting the mail icon from any marketing department user’s menu.

Problem 2: Today, userX reported that they are unable to print from application ABC to printer123, but can print to printer123 from any other application on the same system.

When the problem is resolved and fixed, the result is the inverse of the original problem statement. For example, the inverse of the previous problem statements would be:

Result for problem 1: Currently, all marketing department users are able to successfully launch and use the mail application, without any displayed error messages.

Result for problem 2: Currently, userX is able to print from application ABC to printer123, and can also print to printer123 from any other application on the same system.

Step 3: Formulate testable hypotheses.

By using the problem statement that you created and the information that you collected and recorded, formulate one or more hypotheses as to the cause of the problem. This step is significantly more productive when performed in a brainstorming group that is composed of individuals who are capable of effectively using this scientific method.

Do not rush this step. Although any single hypothesis might seem promising, it is more efficient to formulate, at the same time, all possible, practical hypotheses about the cause of the problem. No relevant, sincere suggestion should be dismissed without being tested.

When formulating and recording each hypothesis, also record a validation test method for each. Although it might appear faster to jump to performing each test as each hypothesis occurs, it is more productive to stay in brainstorming mode until all ideas are exhausted and each hypothesis and test method is recorded in an organized, readable form. Here are examples of hypotheses and test methods for the mail application problem:

Data store XYZ is on a disk that failed. Test by locating data store XYZ and accessing other objects on that disk.
Data store XYZ is on a network share that no longer exists or works properly. Test by locating the share and accessing the share directly by using a proper client.
Data store XYZ is on a storage server that has stopped services or has frozen. Test by locating the correct server and accessing it with management tools.
Data store XYZ is on a storage server that cannot be reached due to a network problem. Test by locating the correct network and accessing the interfaces on that network by using network tools.

Step 4: Test each hypothesis.

Perform each of the tests that you recorded for your hypotheses. Prioritize the tests in the order that you or your group decide is the most likely to quickly find the problem’s root cause. Performing each test should result in either discovering the problem’s cause or in eliminating that hypothesis from your list.

If a test requires configuration changes or another form of system modification, follow this single, inviolate rule:

Only one change may be made during any single test run.

Never change more than one parameter at a time. If the test fails to verify the problem cause, reset that changed parameter to its original value, and then change only one new parameter before performing the next test run.

Record information about each test run, including the changed parameters, distinguishing test characteristics, and the observed result, in an organized, readable form. Failure to record each ongoing test result in an organized, readable form commonly creates an inability to distinguish or recall previous test results, which might cause you to need to repeat one or more tests from your hypotheses list.
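
One possible way to keep such records is sketched below. The class and field names are hypothetical and not part of any Red Hat tool; the sketch only illustrates recording each run and refusing to start a new change while the previous run is still unresolved.

```python
from dataclasses import dataclass, field


@dataclass
class RunRecord:
    hypothesis: str
    parameter: str      # the single parameter changed for this run
    original: str       # value to restore if the hypothesis is eliminated
    tested: str         # value used during the run
    result: str = "open"


@dataclass
class TroubleshootingLog:
    problem_statement: str
    runs: list = field(default_factory=list)

    def start(self, hypothesis, parameter, original, tested):
        # One change per run: refuse to begin while another run is open.
        if self.runs and self.runs[-1].result == "open":
            raise RuntimeError("finish the current run before changing anything else")
        self.runs.append(RunRecord(hypothesis, parameter, original, tested))

    def finish(self, result):
        # result is free text, such as "confirmed" or "eliminated".
        self.runs[-1].result = result

    def eliminated(self):
        # Hypotheses that testing has ruled out, in the order tested.
        return [r.hypothesis for r in self.runs if r.result == "eliminated"]
```

Because every record stores the original value, the log also documents what to restore before the next run.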

Step 5: Record and analyze the test results.

During testing, you record the results of each test run, including any observed behaviors and relevant test characteristics. You should also record any new information that you collect that appears to relate to the problem. This information is useful for creating system reliability methods for permanently mitigating this problem scenario.

If the problem cause is not discovered after performing all the hypothesis tests in your list, you could decide to repeat this scientific method, including any newly collected information in the brainstorming process.

You could also decide that you or your group has insufficient knowledge of the application or systems that you are troubleshooting. Accordingly, you could choose to obtain further training or perform additional research before continuing to troubleshoot this problem. If fixing this problem is time-critical, you might need to escalate the problem to an appropriate higher level of support.

Step 6: Fix and verify the problem resolution.

If you discover the problem cause while performing a hypothesis test, you must still decide how to fix the problem. Some problems might require a temporary fix or workaround, with a permanent fix that is applied after preparation or during a maintenance window. Similar to the earlier steps, every change in your fix plan must include a validation test that conclusively proves that the change is working.

Fixes that you apply should follow the same inviolate rule used during hypothesis testing, that only one change can be made at a time and that the change must be validated before continuing with the next change. Red Hat recommends use of a change management system for applying and tracking changes, such as the Red Hat Ansible Automation Platform. Change management systems provide records that can verify earlier changes, and methods for accurately reverting changes or tested configurations.

After the temporary or permanent fix is applied, test the scenario again against the original problem statement. If the inverse of the problem statement is conclusively true, then you have successfully completed troubleshooting of your problem scenario.

Collecting Information to Support Troubleshooting

After completing this section, you should be able to collect system information to support troubleshooting.

Locating Troubleshooting Information

Successfully solving troubleshooting problems requires not only evaluating current software and hardware behavior, but also discovering previous events that occurred, especially those that relate to the reported failure or issue. Current system behavior is evaluated with diagnostic commands, performance tools, and live monitoring, as discussed in later chapters. Previous events are discovered by viewing the system’s journals and log files, as discussed in this section.

Journals and log files contain messages that are received from applications, services, and the kernel. Messages document events, errors, and configuration parameter changes as they occur. The following methods exist to store message information:

journald is a logging service that was introduced with the systemd service architecture.
rsyslog is a service that is designed to be a centralized message hub for other software to use.
Some applications write their own private log files and do not use journald or rsyslog.

systemd-journald Journal Files

The journald service writes binary-structured, memory-based log files, referred to as journals, that are viewed with the journalctl command. The journals are indexed for faster performance, and the binary format is more secure than plain text log files. The systemd-journald service manages the journals, rotating them automatically to limit their storage space and to reduce the need for maintenance.

Messages that journald obtains can be forwarded to rsyslog by using a memory-based socket file in /run/systemd/journal/. Current rsyslog implementations use an integrated module to access messages that the systemd journal generates instead of by monitoring the socket file.

Viewing Journal Log Files

The journalctl command with no options shows all collected journal entries.

[user@host ~]$ journalctl
-- Logs begin at Tue 2021-09-14 22:51:26 EDT, end at Wed 2021-09-15 02:01:01 EDT. --
Sep 14 22:51:26 localhost kernel: Linux version 4.18.0-305.el8.x86_64 ([email protected]) >
Sep 14 22:51:26 localhost kernel: Command line: BOOT_IMAGE=(hd0,gpt3)/boot/vmlinuz-4.18.0-305.el8.x86_64 root=/dev/vd>
...output omitted...

View journal entries that relate to a specific device, executable, or special file.

[user@host ~]$ journalctl /dev/vda
-- Logs begin at Tue 2021-09-14 22:52:15 EDT, end at Wed 2021-09-15 02:11:18 EDT. --
...output omitted...
Sep 14 22:52:17 localhost kernel: virtio_blk virtio2: [vda] 20971520 512-byte logical blocks (10.7 GB/10.0 GiB)

Filter the journal logs for messages that relate to a specified systemd service or other type of systemd unit.

[user@host ~]$ journalctl -b _SYSTEMD_UNIT=httpd.service
-- Logs begin at Tue 2021-09-14 22:52:15 EDT, end at Wed 2021-09-15 02:14:43 EDT. --
Sep 15 02:02:06 servera.lab.example.com httpd[5874]: Server configured, listening on: port 80

View logs for a single systemd service instance by using that daemon’s process ID (PID).

[user@host ~]$ journalctl -b _SYSTEMD_UNIT=httpd.service _PID=5874
-- Logs begin at Tue 2021-09-14 22:52:15 EDT, end at Wed 2021-09-15 02:15:28 EDT. --
Sep 15 02:02:06 servera.lab.example.com httpd[5874]: Server configured, listening on: port 80
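
For scripted analysis, journalctl can emit one JSON object per entry with the -o json option. The following sketch counts entries at or above a chosen severity per systemd unit. The sample entries are invented for illustration, and treating a missing PRIORITY field as 6 (info) is an assumption of this sketch, not journald behavior.

```python
import json


def severe_entries_by_unit(json_lines, max_priority=3):
    """Count entries whose PRIORITY is <= max_priority (lower = more severe)."""
    counts = {}
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        # Journal field values are exported as strings; PRIORITY may be absent.
        priority = int(entry.get("PRIORITY", 6))
        if priority <= max_priority:
            unit = entry.get("_SYSTEMD_UNIT", "unknown")
            counts[unit] = counts.get(unit, 0) + 1
    return counts


# Invented sample entries in the shape of "journalctl -o json" output.
JOURNAL_SAMPLE = [
    '{"_SYSTEMD_UNIT": "httpd.service", "_PID": "5874", "PRIORITY": "6", "MESSAGE": "Server configured, listening on: port 80"}',
    '{"_SYSTEMD_UNIT": "httpd.service", "_PID": "5874", "PRIORITY": "3", "MESSAGE": "example error"}',
    '{"_SYSTEMD_UNIT": "sshd.service", "PRIORITY": "4", "MESSAGE": "example warning"}',
]
```

In practice the lines could come from a command such as journalctl -b -o json _SYSTEMD_UNIT=httpd.service, captured with the subprocess module.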

Viewing Previous Boot Session Messages

Because the systemd-journald service is one of the first to start at boot, the journal logs contain the earliest events and messages that are generated during the boot process.

The journalctl -b option limits the output to messages from the current boot. To view messages from an earlier boot session, first locate the boot ID for that session.

The journal log files must be configured for persistent storage to be able to view previous boot session messages. Non-persistent journals are stored in the /run memory-mounted file system, and are lost when the system reboots.

[user@host ~]$ journalctl --list-boots
-1 00552ebbabad42c3a439a303138914ee Tue 2021-09-14 22:52:15 EDT-Wed 2021-09-15 02:19:28 EDT
0 30cd7eb9116d4078a1f11c0fbeac8082 Wed 2021-09-29 00:26:09 EDT-Wed 2021-09-29 00:22:47 EDT

View information about a specified systemd service by using a boot ID from a previous boot session.

[user@host ~]$ journalctl --boot 00552ebbabad42c3a439a303138914ee _SYSTEMD_UNIT=httpd.service
-- Logs begin at Tue 2021-09-14 22:52:15 EDT, end at Wed 2021-09-15 02:20:28 EDT. --
Sep 15 02:02:06 servera.lab.example.com httpd[5874]: Server configured, listening on: port 80
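
Locating the boot ID can be scripted. The sketch below parses journalctl --list-boots style text into a mapping from boot offset to boot ID. The sample reuses the boot IDs from the listing above and assumes offsets of -1 for the earlier boot and 0 for the current boot. Lines that do not begin with an integer, such as a column header, are skipped.

```python
def parse_list_boots(output):
    """Map boot offset (0 = current, -1 = previous, ...) to boot ID."""
    boots = {}
    for line in output.splitlines():
        fields = line.split()
        # Expect at least "<offset> <boot id> <timestamps...>".
        if len(fields) < 2:
            continue
        try:
            offset = int(fields[0])
        except ValueError:
            continue  # header or unexpected line
        boots[offset] = fields[1]
    return boots


# Sample text in the shape of "journalctl --list-boots" output.
BOOTS_SAMPLE = """\
-1 00552ebbabad42c3a439a303138914ee Tue 2021-09-14 22:52:15 EDT-Wed 2021-09-15 02:19:28 EDT
 0 30cd7eb9116d4078a1f11c0fbeac8082 Wed 2021-09-29 00:26:09 EDT-Wed 2021-09-29 00:22:47 EDT
"""
```

The value of parse_list_boots(BOOTS_SAMPLE)[-1] is the ID to pass to journalctl --boot when inspecting the previous session.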

Configuring journald for Persistent Storage

By default, Red Hat Enterprise Linux 8 stores the system journal logs in a ring-buffer in /run/log/journal.

Enable persistent journal storage by creating a disk directory and configuring the service to use it.

Step 1: Create the /var/log/journal directory.

[root@host ~]# mkdir /var/log/journal

Step 2: Edit the /etc/systemd/journald.conf file to enable the persistent storage policy.

[root@host ~]# sed -i 's/#Storage=auto/Storage=persistent/' /etc/systemd/journald.conf

Step 3: Restart the systemd-journald service to begin using the new persistent directory. If the configured directory does not exist or is not writable by systemd-journald, the service will continue to use the /run/log/journal location.

[root@host ~]# systemctl restart systemd-journald.service
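
To confirm which storage policy the edited file requests, a check along these lines could be used. It is a simplified sketch: it reads only the text that it is given, ignores section headers and any drop-in configuration directories, and assumes the default value of auto when no uncommented Storage= line is present.

```python
def configured_storage(conf_text, default="auto"):
    """Return the last uncommented Storage= value found in journald.conf text."""
    value = default
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines that start with '#' or ';'.
        if not line or line[0] in "#;":
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "Storage":
            value = val.strip()
    return value
```

For example, passing the contents of /etc/systemd/journald.conf after the sed command in Step 2 should yield persistent, while an untouched file yields auto.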

rsyslog Log Files

The rsyslog service acts as a message processing hub. Applications can use the syslog interface to send messages directly to rsyslog. The rsyslog service uses the imjournal input module to retrieve messages continuously from journald, and can also forward messages that it receives directly to journald by using the omjournal output module.

Messages that are sent to rsyslog include the facility and level fields, which are then used for sorting the messages into log files in the /var/log/ directory. The facility value indicates where the message originated, and the level value indicates the severity of the message. These facilities, among some others, send messages to rsyslog:

Kernel messages
User-level messages
Mail system
System daemons
Security and authorization messages
Messages that are generated internally by syslog
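
In the syslog protocol (RFC 5424), facility and level travel together as one priority number, computed as facility × 8 + level. The sketch below decodes that number for the six facilities in the list above. Other facility codes are returned as a generic label instead of being named.

```python
# Facility codes 0-5 correspond to the list above, in the same order.
FACILITIES = {0: "kern", 1: "user", 2: "mail", 3: "daemon", 4: "auth", 5: "syslog"}
# Level (severity) keywords, indexed by their numeric value 0-7.
LEVELS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def decode_pri(pri):
    """Split a syslog priority number into (facility, level) keywords."""
    facility, level = divmod(pri, 8)
    return FACILITIES.get(facility, "facility%d" % facility), LEVELS[level]
```

A priority of 34, for instance, decodes to the auth facility at the crit level, which rsyslog rules would match with a selector such as auth.crit.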

Locations of the syslog Log Files

The /etc/rsyslog.conf file contains rules for sorting incoming messages by facility and level, and storing them into specific log files. The following files under the /var/log directory contain rsyslog messages:

/var/log/messages stores all the syslog messages that are not explicitly configured for other rsyslog locations.
/var/log/secure stores security and authentication-related messages and errors.
/var/log/maillog stores mail server-related messages and errors.
/var/log/cron stores messages that relate to periodically executed tasks.
/var/log/boot.log stores messages that relate to system startup.

NOTE: You can access log files with the web console. Log in to the web console and then click Logs to view the default rsyslog log entries.

Private Log Files

Some applications maintain their own log files instead of using the rsyslog service. Those applications might need a custom message structure or might process their messages differently from rsyslog or journald. Typically, these log files are in subdirectories of /var/log. For example, the Apache Web Server on a RHEL server saves log messages to files under the /var/log/httpd directory. Samba is another service that uses private log files, with a default file location under the /var/log/samba directory.

Enabling Verbose Information

Many commands and services can increase the logging detail that they generate while running by including a log level option when they start. Examples of log level options include -v, -vv, -vvv, and --debug, which can be included in their startup configuration files. Refer to the documentation for individual services where you want to increase the verbosity or logging levels.

NOTE: Including a debug option for services that are configured under the /etc/sysconfig/ directory might cause that service to be unable to disconnect from its controlling terminal. When such services are started by using systemctl and the service type is forking, the systemctl command does not return until the service is passed an interrupt signal such as Ctrl-C. Alternatively, when troubleshooting, you can run the service manually from the command line with the debug option.

Troubleshooting the Audit Log for SELinux Events

The audit system security log file, at /var/log/audit/audit.log, contains audit records, including those for SELinux access denials. Because the records are dense and a single event can span several lines, inspect them with a search utility rather than by reading the file directly. Use the ausearch tool to query the audit logs for SELinux events.

To resolve SELinux denials, first analyze the root cause as recorded in the audit log. Use the sealert command from the setroubleshoot-server package to help in resolving the problem; the semanage command that is used in the following examples is provided by the policycoreutils-python-utils package. SELinux denials might be caused by an incorrect SELinux label, context, Boolean, or port number.
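
When many denials must be reviewed, the key fields of each AVC record can be extracted programmatically. The following sketch parses one record of the kind that ausearch returns. The sample record is invented for illustration (PID, inode, and contexts are made up), and the regular expressions cover only the common "avc: denied { ... } for key=value ..." layout.

```python
import re

# Matches the permission list and captures the key=value remainder.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}\s+for\s+(?P<rest>.*)"
)


def parse_avc(record):
    """Return a dict of fields from an AVC denial record, or None."""
    match = AVC_RE.search(record)
    if match is None:
        return None
    fields = {"permissions": match.group("perms").split()}
    # The remainder is a series of key=value pairs; values may be quoted.
    for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', match.group("rest")):
        fields[key] = value.strip('"')
    return fields


# Invented sample record, shaped like an SELinux denial in audit.log.
AVC_SAMPLE = (
    'type=AVC msg=audit(1631685726.123:456): avc:  denied  { read } for  '
    'pid=5874 comm="httpd" name="index.html" dev="vda3" ino=12345 '
    'scontext=system_u:system_r:httpd_t:s0 '
    'tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=0'
)
```

Comparing the scontext and tcontext values shows which process type was denied access to which object type, which points toward a label, Boolean, or port fix.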

SELinux context issues also occur when a service uses non-standard directories. For example, use of a web server with the non-standard /home/user/myweb directory requires that directory to be set with the correct SELinux label.

The semanage fcontext command stores a specified SELinux context for a path pattern in the local SELinux policy. The (/.*)? suffix makes the rule match the directory and everything under it.

[root@host ~]# semanage fcontext -a -t httpd_sys_content_t "/home/user/myweb(/.*)?"

The restorecon command looks up a specified directory in the database, retrieves the configured context for that directory, and applies the context on the directory.

Use the following command to apply the context change:

[root@host ~]# restorecon -R -v /home/user/myweb

SELinux Booleans switch optional parts of the policy on or off at runtime, without restarting a service. In this example, enabling the Boolean allows a local web server to access user home directories, which the default policy does not allow. The change lasts until the next reboot unless the -P option is added.

[root@host ~]# setsebool httpd_enable_homedirs on

Many services are allowed by policy to run only on specific port numbers. The semanage port command adds a non-standard port to the set of ports that the policy allows for a service. In this example, port 9876 is added to the ports that a web service can use.

[root@host ~]# semanage port -a -t http_port_t -p tcp 9876

REFERENCES

journalctl(1), systemd.journal-fields(7), systemd-journald.service(8), auditd(8), ausearch(8), and sealert(8) man pages.

For further information, refer to Chapter 10. Troubleshooting Problems Using Log Files

For further information, refer to Chapter 5. Troubleshooting Problems Related to SELinux

Troubleshooting with Red Hat Resources

After completing this section, you should be able to use Red Hat resources to support troubleshooting.

Red Hat Customer Portal

The Red Hat Customer Portal provides customers with access to their subscription benefits in one place. Customers can search for solutions, FAQs, Knowledgebase articles, and official product documentation. Many features are accessible to everyone, while some are exclusive to customers with active subscriptions.

At the Customer Portal, customers can manage Red Hat product subscriptions on registered systems, download software, upgrades, and evaluations, and submit and manage support cases for those systems. The Red Hat Customer Portal supports using the command-line redhat-support-tool to access the same services. Help for obtaining access is available here.

Collecting Information for Red Hat Support

Red Hat Enterprise Linux includes the sos report command in the sos package. This command collects configuration details, log files, system information, and diagnostic information, and then combines the results in a tarball to attach to an open Red Hat Support case. The command can collect information from multiple systems. The tool requires root privileges.

The sos clean subcommand obfuscates potentially sensitive system information that is not removed by standard sos report postprocessing. This data includes IP addresses, networks, MAC addresses, and more.

When run without options, the sos report command prompts the user for a case number, and then collects a default set of files and settings, and creates the tarball. Options can enable or disable plug-ins, configure plug-in options, and make the process run without interaction.

Run sos report -l to view all plug-ins and those which are currently enabled, and configurable plug-in options. To enable additional plug-ins, use the -e ENABLE_PLUGINS option. To configure plug-in options, use the -k PLUGOPTS option. The -n SKIP_PLUGINS option disables unwanted plug-ins in the report execution.
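
When reports are generated from a script, the command line can be assembled from these options. The sketch below only builds the argument list and does not run anything. The function name is hypothetical, the --batch and --case-id options are assumed to be available in the installed sos version, and the plug-in names are placeholders to be checked against the sos report -l output.

```python
def sos_report_argv(case_id=None, enable=(), skip=(), plugopts=(), batch=True):
    """Build an argument list for a non-interactive sos report run."""
    argv = ["sos", "report"]
    if batch:
        argv.append("--batch")          # do not prompt interactively
    if case_id:
        argv += ["--case-id", case_id]  # support case to associate with
    if enable:
        argv += ["-e", ",".join(enable)]  # additional plug-ins to enable
    if skip:
        argv += ["-n", ",".join(skip)]    # plug-ins to skip
    for opt in plugopts:
        argv += ["-k", opt]               # PLUGIN.OPTION=VALUE settings
    return argv
```

The resulting list can be passed to subprocess.run by a process that has root privileges, which the tool requires.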

Use the --encrypt-key and --encrypt-pass options with the sos report command to generate encrypted reports.

NOTE: If the sos report command fails, you can collect the data manually. Follow the steps in Knowledgebase solution 68996, "Sosreport fails. What data should I provide in its place?"

Searching the Knowledgebase

The Red Hat Support Tool utility redhat-support-tool provides a text-console interface to the Red Hat Access subscription services. Internet access is required. The redhat-support-tool is a text-based tool, which can be run using SSH or from any terminal.

The redhat-support-tool can use an interactive shell or be invoked as individual commands with options and arguments. The syntax is identical for both methods. By default, the program launches in shell mode. Use the help subcommand to view available subcommands. Shell mode supports tab completion and calling other programs from the parent shell.

[user@host ~]$ redhat-support-tool
Welcome to the Red Hat Support Tool.
Command (? for help):

When first invoked, redhat-support-tool prompts for the required Red Hat Access subscriber login information. To avoid repetitively supplying this information, the tool asks to store account information in the user’s home directory, ~/.redhat-support-tool/redhat-support-tool.conf. If many users share a Red Hat Access account, the --global option can save account information to /etc/redhat-support-tool.conf, along with other systemwide configuration. The tool’s config command modifies tool configuration settings.

Subscribers can use the redhat-support-tool to search and display the same Knowledgebase content as on the Red Hat Customer Portal. Knowledgebase permits keyword searches, similar to the man command. Users can enter error codes, syntax from log files, or any mix of keywords to produce a list of relevant solution documents.

The following output is an initial configuration and basic search demonstration:

[user@host ~]$ redhat-support-tool
Welcome to the Red Hat Support Tool.
Command (? for help): search How to manage system entitlements with subscription-manager
Please enter your RHN user ID: subscriber
Save the user ID in /home/user/.redhat-support-tool/redhat-support-tool.conf (y/n): y
Please enter the password for subscriber: password
Save the password for subscriber in /home/user/.redhat-support-tool/redhat-support-tool.conf (y/n): y

After prompting for the user configuration, the tool continues with the search request:

Type the number of the solution to view or 'e' to return to the previous menu.
1 [ 253273:VER] How to register and subscribe a system to the Red Hat Customer Portal using Red Hat Subscription-Manager
2 [3121571:VER] How to register and subscribe a system offline to the Red Hat Customer Portal?
2 of 50 solutions displayed. Type 'm' to see more, 'r' to start from the beginning again, or '?' for help with the codes displayed in the above output.
Select a Solution: 1

You can select specific sections of solution documents for viewing.

Type the number of the section to view or 'e' to return to the previous menu.
1 Title
2 Issue
3 Environment
4 Resolution
5 Display all sections
End of options.
Section: 1

Title
==========================================================================
How to register and subscribe a system to the Red Hat Customer Portal using Red Hat Subscription-Manager
URL:  https://access.redhat.com/solutions/253273
Created On: 2012-10-30T04:24:25-04:00
Modified On: 2017-11-29T10:33:51-05:00
(END) q

Directly Access Knowledgebase Articles by Document ID

Locate online articles using the tool’s kb command with the Knowledgebase document ID. Documents scroll without pagination, allowing a user to redirect the output to other commands.

[user@host ~]$ redhat-support-tool kb 253273 | less

Title
==========================================================================
How to register and subscribe a system to the Red Hat Customer Portal using Red Hat Subscription-Manager
URL:         https://access.redhat.com/solutions/253273 

Issue
==========================================================================
* How to register a new `Red Hat Enterprise Linux` system to the Customer Portal using `Red Hat Subscription-Manager`
* How to un-register a system using `Red Hat Subscription-Manager`

: q

Documents that are retrieved in unpaginated format are easy to send to a printer, or convert to PDF or other document format. Documents can also be redirected to a data entry program for an incident tracking or change management system, by using other utilities that are installed and available in Red Hat Enterprise Linux.

Managing Support Cases

One subscription benefit is access to technical support through Red Hat Customer Portal. Depending on the system’s subscription support level, Red Hat may be contacted through online tools or by phone. See Contacting Red Hat Technical Support for information about the support process.

Preparing a Bug Report

Before contacting Red Hat Support, gather relevant information for a bug report.

Define the problem. Clearly state the problem and its symptoms. Be as specific as possible. Detail the steps to reproduce the problem.
Gather background information. Which products and versions are affected? Be ready to provide relevant diagnostic information. The files can include the output of sos report, as discussed earlier in this section. For kernel problems, the files could include the system’s kdump crash dump or a digital photo of the kernel backtrace that is displayed on the monitor of a crashed system.
Determine the severity level. Red Hat uses four severity levels to classify issues. Urgent and High severity problem reports should be followed by a phone call to the relevant local support center.
- Urgent (Severity 1): A problem that severely impacts use of the software in a production environment (such as loss of production data, or production systems are not functioning). The situation halts business operations and no procedural workaround exists.
- High (Severity 2): A problem where the software is functioning, but use in a production environment is severely reduced. The situation is causing a high impact to portions of the business operations and no procedural workaround exists.
- Medium (Severity 3): A problem that involves partial, non-critical loss of use of the software in a production environment or development environment. For production environments, the situation has a medium-to-low impact on the business, but the business continues to function, including by using a procedural workaround. For development environments, the situation is causing the project to no longer continue or migrate into production.
- Low (Severity 4): A general usage question, reporting of a documentation error, or recommendation for a future product enhancement or modification. For production environments, the situation has a low-to-no impact on the business or on the performance or functionality of the system. For development environments, there is a medium-to-low impact on the business, but the business continues to function, including by using a procedural workaround.

Managing Bug Reports

Subscribers can create, view, modify, and close Red Hat Support cases with redhat-support-tool. When support cases are opened or maintained, users can include files or documentation, such as diagnostic reports (sos report). The tool uploads and attaches files to online cases. Case details including product, version, summary, description, severity, and case group may be assigned with command options or by allowing the tool to prompt for required information. In this example, the --product and --version options are specified, but redhat-support-tool could provide choices if the opencase command had not specified them.

[student@demo ~]$ redhat-support-tool
Welcome to the Red Hat Support Tool.
Command (? for help): opencase --product="Red Hat Enterprise Linux" --version="8.4"
Please enter a summary (or 'q' to exit): System fails to run without power
Please enter a description (Ctrl-D on an empty line when complete):
When the server is unplugged, the operating system fails to continue.

1   Low
2   Normal
3   High
4   Urgent
Please select a severity (or 'q' to exit): 4
Would you like to use the default (Ungrouped Case) Case Group (y/N)? : y
Would you like to see if there is a solution to this problem before opening a support case? (y/N) N
--------------------------------------------------------------------
Support case 03022708 has successfully been opened.

Including Diagnostic Information in Support Cases

Including diagnostic information when you create a support case contributes to quicker problem resolution. The sos report command generates a compressed tarball archive of diagnostic information that is gathered from the running system. After a case is opened, redhat-support-tool prompts you to attach an archive, which is convenient if one was created previously:

Please attach a SoS report to support case 03022708. Create a SoS report as
the root user and execute the following command to attach the SoS report
directly to the case:
 redhat-support-tool addattachment -c 03022708 <path to sosreport>

Would you like to attach a file to 03022708 at this time? (y/N) N
Command (? for help):

If an sos report archive is not already prepared, then an administrator can generate one and attach it later with the addattachment command. Subscribers can also view, modify, and close support cases from within the tool:

Command (? for help): listcases

Type the number of the case to view or 'e' to return to the previous menu.
 1 03022708 [Waiting on Red Hat ] [sev4] System fails to run without power
No more cases to display
Select a Case: 1

Type the number of the section to view or 'e' to return to the previous menu.
 1 Case Details
 2 Modify Case
 3 Description
 4 Get Attachment
 5 Add Attachment
 6 Add Comment
End of options.
Option: q

Select a Case: q

Command (? for help): q

[user@host ~]$ redhat-support-tool modifycase --status=Closed 03022708
[user@host ~]$
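
The attachment step described above can also be scripted. The following is a minimal sketch, not a Red Hat-provided script. It assumes that sos report writes archives named sosreport-*.tar.xz under /var/tmp (a common default on RHEL 8, but worth verifying on your system). The function names latest_sos_archive and attach_command are hypothetical helpers invented for this illustration. The sketch only prints the addattachment command from the prompt shown earlier, so that it can be reviewed before it is run.

```shell
#!/usr/bin/env bash
# Sketch: find the newest sos report archive and build the attach command.
# Assumption: archives are named sosreport-*.tar.xz and live in /var/tmp.

# Print the path of the most recently modified sos archive in a directory.
# Returns a non-zero status when no archive is found.
latest_sos_archive() {
    local dir="${1:-/var/tmp}" newest="" f
    for f in "$dir"/sosreport-*.tar.xz; do
        [ -e "$f" ] || continue              # glob matched nothing
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"                      # keep the newer file
        fi
    done
    [ -n "$newest" ] || return 1
    printf '%s\n' "$newest"
}

# Print (do not run) the command that attaches an archive to a case.
attach_command() {
    local case_number="$1" archive="$2"
    printf 'redhat-support-tool addattachment -c %s %s\n' \
        "$case_number" "$archive"
}
```

For example, after generating a report as root, calling attach_command 03022708 "$(latest_sos_archive)" prints the command that would attach the newest archive to the example case. Printing rather than executing is a deliberate choice here: diagnostic archives can contain sensitive data, so it is reasonable to confirm which file is uploaded.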

The Red Hat Support Tool has advanced application analytic capabilities. With kernel crash dump files, the redhat-support-tool command can create and extract a backtrace report of the active stack frames at the point of a crash dump, to provide onsite diagnostics and to open a support case.

The tool also analyzes log files. With the tool’s analyze command, you can parse the log files of many applications to recognize problem symptoms, which you can then inspect and diagnose individually.

Red Hat Customer Portal Tools

The Red Hat Customer Portal home page provides a Tools menu to troubleshoot Red Hat products, making it easy to search for common issues and possible solutions.

The Red Hat Customer Support Troubleshoot tool helps with searching for common solutions in the Knowledgebase.

Red Hat Customer Portal Labs

Red Hat Customer Portal Labs provide web-based applications to aid in configuration, deployment, security, and troubleshooting.

The Customer Portal Troubleshooting Labs include various analysis tools:

Red Hat I/O Usage Visualizer: Provides a visual report of I/O usage.
Red Hat Memory Analyzer: Provides a visual report of memory usage.
JVMPeg: Helps to analyze JVM threads that are driving CPU usage above a specified threshold.
Log Reaper: Provides log analysis that emphasizes identifying errors in log files, presented in a view tailored to each log type, with automatic solution recommendations and targeted analysis.
Kernel Oops Analyzer: Helps to diagnose a kernel fault from an error message or a file that contains one or more kernel oops messages.

Red Hat Insights

Red Hat Insights is a hosted service that helps system administrators and managers to proactively manage their systems. Red Hat Insights securely uploads key information from a system to Red Hat, where it is analyzed to produce a set of tailored recommendations. These recommendations help to keep systems stable and performing well by spotting potential issues and giving remediation advice before they become larger problems that cause disruption.

The web interface outlines issues that can affect registered systems, ordered by severity and type. Administrators can drill down into issues for tailored recommendations. Administrators can also choose to permanently ignore certain rules.

Red Hat Insights Services: Red Hat Insights provides services to administer and monitor registered systems to troubleshoot and remediate identified issues.

Advisor: Advisor identifies known configuration risks in the operating system, underlying infrastructure, or workloads that impact performance, stability, availability, or security best practices.

Vulnerability: Vulnerability assesses, remediates, and reports on Common Vulnerabilities and Exposures (CVEs) that impact Red Hat Enterprise Linux environments in the cloud or on premises.

Compliance: Compliance analyzes the compliance level of a Red Hat Enterprise Linux environment to an OpenSCAP policy, based on the version of SCAP Security Guide (SSG), supported by Red Hat Enterprise Linux.

Patch: Patch determines which Red Hat product advisories apply to an organization’s specific Red Hat Enterprise Linux instances. It provides guidance for remediation, either manually or by using Ansible Playbooks for patching.

Drift: Drift compares systems to baselines, to system histories, and to each other to troubleshoot or identify differences.

Policies: Organizations can use the Policies service to define and monitor policies that are important internally, with alerts for environments that are not aligned to a policy.

Inventory: The Inventory service lists all of the hosts that are registered to Insights.

Remediations: The Remediations service lists all remediation plans, concentrating them in one place for easy access and analysis. You can download the remediation plans from this service. Alternatively, if Smart Management with Cloud Connector is also configured, then you can execute the playbook directly from the Remediations service.

Subscription Watch: Subscription Watch provides unified reporting of Red Hat subscription usage for easier and more efficient management of subscriptions to Red Hat Enterprise Linux and Red Hat OpenShift Platform.

Registering Systems Using Red Hat Insights

To register systems with Red Hat Insights, follow these steps:

Step 1: Ensure that the insights-client package is installed:

[root@host ~]# yum install insights-client

Step 2: Register the system with Red Hat Insights:

[root@host ~]# insights-client --register
You successfully registered 4578cb7f-47a9-4203-bef4-42651a257984 to account 5662036.
Successfully registered host host.lab.example.com
Automatic scheduling for Insights has been enabled.
Starting to collect Insights data for host.lab.example.com
Uploading Insights data.
Successfully uploaded report from host.lab.example.com to account 5662036.
View the Red Hat Insights console at https://cloud.redhat.com/insights/

Immediately after registering a system, Insights data becomes available in the web interface.
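
The two registration steps above can be combined into one repeatable function. This is a minimal sketch under stated assumptions, not an official procedure: the function names run and register_insights and the DRY_RUN variable are hypothetical and exist only in this example (DRY_RUN is not an insights-client feature). The sketch uses rpm -q to skip the installation when the package is already present, and finishes with insights-client --status to report the registration state.

```shell
#!/usr/bin/env bash
# Sketch: install insights-client only when missing, then register.
# DRY_RUN=1 prints each command instead of running it (example-only switch).

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

register_insights() {
    # Step 1: install the client package only if rpm reports it missing.
    if ! rpm -q insights-client >/dev/null 2>&1; then
        run yum install -y insights-client || return 1
    fi
    # Step 2: register the host, then report the registration status.
    run insights-client --register || return 1
    run insights-client --status
}
```

Running the function as root with DRY_RUN=1 first shows which commands would be executed on the host. Unsetting DRY_RUN and calling register_insights again performs the actual installation and registration.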

Summary

In this article, you learned:

The scientific method is an efficient form of troubleshooting.
The journalctl command outputs system logs generated by systemd services and other units.
The sos report command gathers detailed system information into a tarball.
The redhat-support-tool command interfaces with Red Hat support services.
The Red Hat Customer Portal provides tools to diagnose and troubleshoot issues.
The Red Hat Insights service provides system analytics and remediation strategies.