API Primer — Logging — Holy grail of production applications
This is a short post about application logging. I pondered whether it should be a separate note or part of a miscellaneous one.
However, the importance of logging ensured it got its own post.
Here are quick links to other posts about building APIs
And then logging…
Myth: Log a lot, log everything
Fact
- Minimal effective logs are needed
- Each log call takes time, and uncontrolled logging has performance implications: a hit of up to 10% has been observed in performance tests
What not to log
- Entry to and exit from functions. If you really need this, make sure it’s a debug log
- Ensure production is reporting only Info, Warn, Error, Fatal level logs (Note: The log levels vary based on the library and language)
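As a minimal sketch of the two points above, here is entry/exit logging kept at debug level so that an Info-and-above production setup never pays for it. It assumes `java.util.logging`, where `FINE` is roughly the "debug" level; the class name, logger name and `messagesAt` helper are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class EntryExitLogging {
    // Hypothetical logger name, for illustration only.
    static final Logger LOG = Logger.getLogger("demo.entryexit");

    static int add(int a, int b) {
        // The lambda is evaluated only when FINE is enabled,
        // so the message string is never built at INFO and above.
        LOG.fine(() -> "Entering add(" + a + ", " + b + ")");
        int sum = a + b;
        LOG.fine(() -> "Exiting add with " + sum);
        return sum;
    }

    // Runs add() with the logger at the given level and returns the messages actually emitted.
    static List<String> messagesAt(Level level) {
        List<String> seen = new ArrayList<>();
        Handler capture = new Handler() {
            @Override public void publish(LogRecord r) { seen.add(r.getMessage()); }
            @Override public void flush() {}
            @Override public void close() {}
        };
        capture.setLevel(Level.ALL);
        LOG.setUseParentHandlers(false);
        LOG.addHandler(capture);
        LOG.setLevel(level);
        try {
            add(2, 3);
        } finally {
            LOG.removeHandler(capture);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(messagesAt(Level.INFO)); // entry/exit logs are suppressed
        System.out.println(messagesAt(Level.FINE)); // entry/exit logs are visible
    }
}
```

Other logging libraries offer similar guards or lazy message construction under different names, so treat the exact calls as specific to `java.util.logging`.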
Code for failure
That sounds very pessimistic, doesn’t it?
Things go wrong and when they go wrong is when a developer’s true merit comes forth.
The logs should speak and tell us what exactly went wrong
Hardening of code
Yes, just as a server is hardened for production, a code base should be hardened as well
What is it
- Log level control and right defaults (default to be info and above)
- Ensure every line of code is combed through for failure scenarios like NullPointerException (NPE), index out of bounds, and so on
- Ensure precise logging in exception blocks
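One way to get "log level control and right defaults" is to read the level from configuration and fall back to Info when it is missing or invalid. The sketch below assumes `java.util.logging`; the property name `app.log.level` and the class name are hypothetical.

```java
import java.util.Locale;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

class LogLevelDefaults {
    // Resolves a configured level name, defaulting to INFO when unset or unrecognised.
    static Level resolve(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return Level.INFO;
        }
        try {
            return Level.parse(configured.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return Level.INFO; // a bad value must never switch logging off or flood production
        }
    }

    public static void main(String[] args) {
        // "app.log.level" is a made-up property name, e.g. -Dapp.log.level=FINE
        Level level = resolve(System.getProperty("app.log.level"));
        Logger root = Logger.getLogger("");
        root.setLevel(level);
        for (Handler h : root.getHandlers()) {
            h.setLevel(level); // handlers filter too, so keep them in step with the logger
        }
        root.info("Logging at level " + level.getName());
    }
}
```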
How and What to log
Exceptions
- Ensure we have a log.error or log.fatal or log.warn depending on application scenario
- Do take care to avoid duplicate logging, especially stack traces as it can get really cluttered when viewing logs
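A common way to avoid duplicate stack traces is to let lower layers add context and rethrow, and to log exactly once at the boundary. The following is only a sketch of that idea using `java.util.logging`; `loadUser`, `handleRequest` and the simulated failure are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class LogOnce {
    static final Logger LOG = Logger.getLogger("demo.logonce");

    // Lower layer: adds context and rethrows, but does not log.
    static String loadUser(int id) {
        try {
            throw new IllegalStateException("connection refused"); // simulated failure
        } catch (IllegalStateException e) {
            throw new RuntimeException("Failed to fetch ID:" + id + " from database", e);
        }
    }

    // Boundary layer: the single place where the failure is logged, with its stack trace.
    static String handleRequest(int id) {
        try {
            return loadUser(id);
        } catch (RuntimeException e) {
            LOG.log(Level.SEVERE, "Request failed for ID:" + id, e);
            return "error";
        }
    }

    // Helper for the demo: runs one request and returns the log records it produced.
    static List<LogRecord> recordsFor(int id) {
        List<LogRecord> seen = new ArrayList<>();
        Handler capture = new Handler() {
            @Override public void publish(LogRecord r) { seen.add(r); }
            @Override public void flush() {}
            @Override public void close() {}
        };
        LOG.setUseParentHandlers(false);
        LOG.addHandler(capture);
        try {
            handleRequest(id);
        } finally {
            LOG.removeHandler(capture);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(recordsFor(42).size() + " log record(s) for one failure");
    }
}
```

Because the original exception travels along as the cause, the single log entry still carries the full stack trace.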
Important decisions / scenarios
- Ensure critical information or decision points in the code logic are logged to aid application debugging and understanding
- Examples: important counters, critical if else conditions
External API
- Ensure clear logs are maintained in case of any errors when invoking external APIs
- Examples of data to be logged: Status Code, response body and exceptions (must have for non 2xx status codes)
- Remember it is tough to debug dependencies and the more information available the easier it will be
Database
- Ensure DB exceptions are well logged with as much context as possible
- Ensure the attributes of SQLException are available in the log, as debugging DB errors is painful
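To make the SQLException point concrete, here is a small sketch that flattens the standard JDBC attributes (`getSQLState()`, `getErrorCode()`, `getMessage()` and the `getNextException()` chain) into one log-friendly string. The `describe` helper and the sample values are assumptions for illustration, not output from a real database.

```java
import java.sql.SQLException;

class SqlErrorDetails {
    // Builds a single line containing the context plus every exception in the JDBC chain.
    static String describe(String context, SQLException e) {
        StringBuilder sb = new StringBuilder(context);
        for (SQLException cur = e; cur != null; cur = cur.getNextException()) {
            sb.append(" | SQLState=").append(cur.getSQLState())
              .append(", errorCode=").append(cur.getErrorCode())
              .append(", message=").append(cur.getMessage());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative values only; real codes depend on the database vendor.
        SQLException e = new SQLException("duplicate key", "23505", 1);
        e.setNextException(new SQLException("batch aborted", "25P02", 0));
        System.out.println(describe("Failed to insert order ID:42", e));
    }
}
```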
Other aspects
- Logs can be used for tracking various aspects like time taken, number of calls, retry count, etc. This is a contentious topic due to overlap of responsibility with other tools. Decision to be taken in context of your application
- Ensuring application logs are consistent would definitely aid in log analysis
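If you do decide to log things like time taken or retry count, a consistent shape makes later analysis much easier. The sketch below uses a simple `key=value` layout; the layout, the `format` helper and the field names are assumptions, not a standard.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class ConsistentLogLine {
    // Renders an event name and its fields in a fixed key=value layout.
    static String format(String event, Map<String, Object> fields) {
        String kv = fields.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(" "));
        return kv.isEmpty() ? "event=" + event : "event=" + event + " " + kv;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // ... the work being measured would go here ...
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        Map<String, Object> fields = new LinkedHashMap<>(); // keeps insertion order stable
        fields.put("elapsedMs", elapsedMs);
        fields.put("retryCount", 2);
        System.out.println(format("fetchOrder", fields));
    }
}
```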
Log Management
Most applications now log only to the console, and log management tools like ELK, EFK and LogDNA collect the logs and ship them to their own platforms
If an application uses file-based logging, the following are must-haves to stop logs from eating up disk space and to keep them processable, especially when something goes wrong
- Log retention by time and size
- Log rotation
- Log compression
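For file-based logging, size-based rotation with a capped number of files can be sketched with the JDK's own `java.util.logging.FileHandler`. This handler covers rotation and a file-count limit only; time-based retention and compression are not part of it and would need another library or an external tool. The file name pattern, limits and class name below are arbitrary choices for the demo.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;
import java.util.stream.Stream;

class RotatingFileLog {
    // Writes the given number of records and returns how many log files remain on disk.
    static long writeAndCountFiles(int records) {
        try {
            Path dir = Files.createTempDirectory("demo-logs");
            // %g is the generation number; rotate at about 1 KB and keep at most 3 files.
            FileHandler handler =
                    new FileHandler(dir.resolve("app.%g.log").toString(), 1024, 3, true);
            handler.setFormatter(new SimpleFormatter());

            Logger log = Logger.getLogger("demo.rotation");
            log.setUseParentHandlers(false); // keep the demo quiet on the console
            log.addHandler(handler);
            try {
                for (int i = 0; i < records; i++) {
                    log.info("record " + i + " with some padding to fill the file");
                }
            } finally {
                log.removeHandler(handler);
                handler.close(); // releases the lock file
            }
            try (Stream<Path> files = Files.list(dir)) {
                return files.filter(p -> p.getFileName().toString().endsWith(".log")).count();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Log files kept: " + writeAndCountFiles(200));
    }
}
```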
Examples
log.error(exception);
Doesn’t give any information as to what data caused this exception
log.error("Failure", exception);
A botched attempt to fix the previous log
log.error("Failed to fetch ID:" + id + " from database", exception);
Now with this log we know what data failed (id) and during which operation (database)
log.error("Failed to call API", exception);
Does not tell us which API was called, with what parameters, or what the response was
log.error("Failed to call API", exception);
log.error("Request:", request);
log.error("Response Status Code:", statusCode);
log.error("Response:", response);
Now we are logging all the information, but in separate logs.
These 4 logs can end up many lines apart, depending on the load on the API server and on what the rest of the application is logging for concurrent requests. Please avoid such logging and follow the example below
log.error("Failed to call xyz API, Request:" + request + ", Response Status Code:" + statusCode + ", Response Body: " + response);
Managing Exceptions
new Exception("Failed to fetch ID:" + id + " from database");
Pass the exception along, do not lose it, e.g. SQLException contains lots of information that helps us decipher DB issues
new Exception("Failed to fetch ID:" + id + " from database", originalException);
Ensures we have the complete context when we decide to log the exception
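A short sketch of why keeping the original exception matters: once it is passed as the cause, the code that finally logs can still reach the SQLException and its attributes. The helper names `withContext` and `findSqlCause` and the sample values are made up for the example.

```java
import java.sql.SQLException;

class ExceptionChaining {
    // Wraps a low-level failure with context while keeping the original as the cause.
    static Exception withContext(int id, SQLException original) {
        return new Exception("Failed to fetch ID:" + id + " from database", original);
    }

    // Walks the cause chain and returns the first SQLException, or null if there is none.
    static SQLException findSqlCause(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof SQLException) {
                return (SQLException) cur;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Illustrative values only.
        SQLException original = new SQLException("deadlock detected", "40P01", 0);
        Exception wrapped = withContext(42, original);
        SQLException cause = findSqlCause(wrapped);
        System.out.println(wrapped.getMessage() + " (SQLState=" + cause.getSQLState() + ")");
    }
}
```

Had the wrapper been built without the cause, `findSqlCause` would return null and the SQLState would be gone for good.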
Conclusion
While this post has focused on logging for APIs and concurrency around it, the concepts are applicable to batch jobs or any program.