Why logging is important
Logging plays a crucial role in supporting your service in production. Without enough logging, debugging production issues becomes very difficult.
With too little logging, the support team struggles to resolve customer issues on its own. As a result, more problems land on the engineering team. Some of them stem from invalid inputs or invalid data supplied by customers, and these are hard for engineers to diagnose without logs. The net effect is a loss in development productivity.
For example, on an e-commerce website, a few customers reported that the Order Details page had missing images, details, and tracking information. The sample logs were recorded during the Order Details API call.
How to Design proper Instrumentation or logging
Several aspects need to be considered while instrumenting your code: for instance, log correlation, which technology to choose, data retention, cost, and many more. Some of them are mandatory and part of fundamental engineering. These require detailed planning and thought ahead of time.
End to End Correlation
Each API request should have a unique identifier that is attached to every log statement related to that request. Correlation ID and request ID are common names for this identifier. The HTTP request carries the unique identifier in a request header.
For instance, you can generate a GUID in the client application and send it as an HTTP header under any key name, such as x-correlation-id.
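As a rough illustration of the idea, the sketch below reads the correlation ID from the incoming headers (or generates a GUID when none is supplied) and attaches it to every log line for that request. It uses only Python's standard `logging` and `uuid` modules. The helper names, the `orders` logger, and the sample header value are assumptions made for this example, not part of any particular framework.

```python
import io
import logging
import uuid

CORRELATION_HEADER = "x-correlation-id"  # header name from the text; any key name works


def get_or_create_correlation_id(headers: dict) -> str:
    """Return the caller-supplied correlation ID, or generate a new GUID."""
    for key, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lowercase.
        if key.lower() == CORRELATION_HEADER and value:
            return value
    return str(uuid.uuid4())


def make_request_logger(base: logging.Logger, correlation_id: str) -> logging.LoggerAdapter:
    """Wrap a logger so every record it emits carries the correlation ID."""
    return logging.LoggerAdapter(base, {"correlation_id": correlation_id})


# Demo setup (hypothetical): capture output in memory to keep the example self-contained.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s [%(correlation_id)s] %(message)s"))
logger = logging.getLogger("orders")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

incoming_headers = {"X-Correlation-Id": "abc-123"}  # sample request headers
request_log = make_request_logger(logger, get_or_create_correlation_id(incoming_headers))
request_log.info("Fetching order details")
request_log.info("Order details returned")
```

Because both log lines carry the same identifier, searching the logs for that one value returns everything recorded for the request.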
Contextual Log statements
Use simple, contextual log statements that help the reader understand the purpose of each log. For instance, when the order details API is called, the following logs are easy to read.
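One possible way to keep log statements contextual is to build every message from the same three parts: the step being performed, the order it applies to, and any extra key/value context. The sketch below is only an illustration. The `describe_step` helper, the `order_details` logger name, and the placeholder order data are all hypothetical.

```python
import logging

logger = logging.getLogger("order_details")


def describe_step(step: str, order_id: str, **context) -> str:
    """Build a message that names the step, the order, and any extra context."""
    details = " ".join(f"{key}={value}" for key, value in sorted(context.items()))
    return f"{step} order_id={order_id} {details}".strip()


def fetch_order_details(order_id: str) -> dict:
    """Hypothetical handler that logs each stage of the lookup in plain words."""
    logger.info(describe_step("Fetching order details", order_id))
    order = {"order_id": order_id, "items": 3}  # placeholder for a real data lookup
    logger.info(describe_step("Fetched order details", order_id, item_count=order["items"]))
    return order
```

A reader who has never seen the code can still tell from such messages which step ran and for which order.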
Different types of logs
Common log types are Informational, Verbose, Warning, and Error.
|Informational (Info)|Records general information about the progress and flow of the process. For instance, start/stop of a service, intermediate checkpoints, etc. This type should be ON by default in production systems.|
|Verbose|Records more granular steps than general information. For instance, each iteration of a for loop with its parameters, or other system information that is only needed in specific scenarios. Keeping this type ON is not recommended. Enable it only when Info logs are not enough to debug.|
|Warning|Records a problem that may cause side effects but does not abort the execution of the process. For instance, the shipment tracking call fails, so tracking information cannot be shown on the order details page, but the other order details are still available. This kind of information should be recorded as a Warning.|
|Error|Records errors/exceptions that abort further execution of the process. Generally, this type of log is recorded in a try-catch block where exception handling is needed.|
Verbose-level logging helps uncover internal issues that Informational logs do not reveal. As a result, the support team can easily understand what is wrong with a given order.
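The four log types can be sketched with Python's standard `logging` module. Python has no level called Verbose, so this example assumes `DEBUG` plays that role. The `ListHandler` class, the `load_order` and `run` functions, and the simulated failure are hypothetical and exist only to show which lines appear at each configured level.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted records in a list so the demo needs no console or files."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


def load_order(order_id, logger, tracking_available=True):
    logger.info("Fetching order details order_id=%s", order_id)  # Informational
    for part in ("image", "description"):
        logger.debug("Loaded %s order_id=%s", part, order_id)  # Verbose (DEBUG)
    if not tracking_available:
        # Partial failure: the page can still render without tracking data.
        logger.warning("Shipment tracking unavailable order_id=%s", order_id)
    return {"order_id": order_id, "tracking": tracking_available}


def run(level):
    """Run one simulated request and return the lines logged at the given level."""
    handler = ListHandler()
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger = logging.getLogger(f"orders.{logging.getLevelName(level)}")
    logger.handlers = [handler]
    logger.propagate = False
    logger.setLevel(level)
    try:
        load_order("A1", logger, tracking_available=False)
        raise RuntimeError("database unreachable")  # simulated fatal failure
    except RuntimeError:
        # Error: execution cannot continue; exception() also records the traceback.
        logger.exception("Order details request aborted order_id=A1")
    return handler.lines
```

Calling `run(logging.INFO)` yields only the Info, Warning, and Error lines, while `run(logging.DEBUG)` also includes the per-item Verbose lines.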
Avoid Sensitive Data / PII (Personally Identifiable Information)
Personally Identifiable Information (PII) is any information that can identify an individual, for instance a name, email address, telephone number, or social security number.
For data privacy and compliance reasons, never log any PII of your users. You can read more about Data Privacy.
When logging API requests, make sure the Authorization header is removed or its value is replaced with a static placeholder. In addition, if your API only works with Bearer tokens, scan all HTTP header values and query string parameters. If any value contains the Bearer keyword, replace it as well. This catches the case where a caller mistakenly sends the bearer token under an incorrect header key. Otherwise the token can end up in the logs, and your logs can be flagged for non-compliance.
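A minimal sketch of this redaction step, assuming headers and query parameters arrive as plain dictionaries of already-decoded strings. The `redact_mapping` function, the placeholder text, and the regular expression are illustrative choices, not a complete defence against every way a token can leak.

```python
import re

REDACTED = "[REDACTED]"  # static placeholder; the exact text is arbitrary
# Matches the word "Bearer" (any case) followed by the token itself.
BEARER_PATTERN = re.compile(r"\bbearer\s+\S+", re.IGNORECASE)


def redact_mapping(values: dict) -> dict:
    """Return a copy of headers or query parameters that is safe to log."""
    safe = {}
    for key, value in values.items():
        if key.lower() == "authorization":
            # Always hide the Authorization header, whatever scheme it uses.
            safe[key] = REDACTED
        else:
            # Catch bearer tokens sent under an unexpected key by mistake.
            safe[key] = BEARER_PATTERN.sub(REDACTED, str(value))
    return safe


headers = {
    "Authorization": "Bearer abc.def",
    "X-Auth": "Bearer abc.def",  # token sent under the wrong header key
    "Accept": "application/json",
}
query = {"token": "bearer xyz", "page": "2"}

safe_headers = redact_mapping(headers)
safe_query = redact_mapping(query)
```

Running the same function over both headers and query string parameters keeps the rule in one place, so a new logging call cannot skip it by accident.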
Application Monitoring Tools
These days, various options are available for instrumenting your code, for instance Microsoft Application Insights, New Relic, AppDynamics, and many more. To abstract your instrumentation technology, you can choose whichever of these tools suits your requirements.
Anyone on the team should be able to identify a production issue from the logs in less than 5 minutes. If your team can achieve this, you are following best practices for instrumenting your code.
Happy Logging. 🙂