When Convention Gets It Wrong Posted on June 10th, 2016
The other day, I needed to add some logging to a Guzzle HTTP request. During my testing phase I was seeing the ApiException
being triggered, but I never saw any data in the logs. Just by reading through this code, can you spot the error?
$response = $guzzle->send($request);
$decoded = json_decode($response->getBody()->getContents(), true);
if (json_last_error() !== JSON_ERROR_NONE) {
$this->logger->error('API Deserialization Failure', [
'serialization' => 'json',
'body' => $response->getBody()->getContents()
]);
throw new ApiException('Deserialization Failure');
}
If you guessed, the body
portion of the log will always be empty you're correct! For those of you who didn't, you're probably missing 2 important pieces of information.
- Guzzle follows PSR-7, which means
$response->getBody()
returns aStreamInterface
. -
getContents
method, in regards to streams, returns the unread portion of the stream.
getContents()
is not an idempotent method, meaning each time it is called its value may change. In this specific case, the first time it gets called it will return the entire response, while any subsequent call returns an empty string.
Why would Guzzle have a method that doesn’t seem to do what it says? This is because of the PSR-7 standard that sets conventions for HTTP message interfaces. Convention can be a powerful tool. It allows engineers to make assumptions about how something works. For example, an engineer can assume a POST HTTP request is mutating data and its GET counterparts retrieving data. It would also allow engineers to assume that StreamInterface::getContents()
would return the unread portion of the stream.
However, I believe that readability trumps convention. One of the most important goals of code is to clearly convey what it's doing. The best way to do this is with self documenting code. The above example reads like every time you call getContents()
you will, in fact, get the entire contents. In this case, it fails to accurately convey what is happening for two specific reasons. By adhering strictly to established conventions the code lost its ability to self document. Additionally, the focus on the inner workings of the PSR-7 standard neglected to consider how the interface would be consumed in a real use case.
For this specific example, only engineers who had previously worked with streams would assume getContents
only returned the unread portion of the stream. The remainder of engineers, and I assume the wider PHP community, most likely would assume it will return the contents of the response as a string. However, if the method was named getUnreadContents()
or perhaps getRemainingContents()
, the self documenting code would have better provided a clue as to what the engineer should expect. In this case, the community stuck with convention - a convention that's already been adopted but is not wide spread or widely known.
There have been many examples where the PHP community has stopped creating developer friendly APIs in a quest for cleaner code. A perfect example of this is Guzzle dropping ->json
and ->xml
methods from their response objects. While it did make the Guzzle Response object strictly adhere to PSR-7, it ignored how end users were using the library:
// this code
$data = $guzzle->send($request)->json();
// became
$data = json_decode($guzzle->send($request)->getBody(), true);
if (json_last_error() === JSON_ERROR_NONE) {
// throw exception
}
In all honesty, the manual JSON decode and error checking is easy to add, but it is certainly not developer friendly. It is not a library I'd enjoy working with. Developers of libraries should focus on how their code is going to be used rather than some metric of clean code.
The quickest solution to correctly logging the error response would be to cast the response body to a string or directly called $response->getBody()->__toString()
. It could have been prevented if the developer had read the docs. It could have been prevented if getBody()
returned the entire response as a string and provided getBodyStream()
that returned the body as the stream. Even better yet, it could have been prevented if getContents()
had been renamed to accurately describe what it was doing. In the end it was a simple error that could have been resolved with many different solutions. Unfortunately, I don't think this will be the last time we’ll see engineers complaining about unexpected results from getContents()
.