The right way to handle gRPC errors

ainatipen · posted 2022-4-6 07:26

Standard error model

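Overview

When a gRPC call succeeds, OK (code=0) is returned to the client.
When an error occurs, gRPC returns an error status code plus a message.
Over HTTP/2, the gRPC status code and message are sent in the trailers, as shown in the figure below: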

image.png
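
To make the trailer mechanism concrete, here is a minimal sketch of reading the status back out of the trailer fields. It assumes the trailers have already been decoded into a plain dict of strings; the field names grpc-status and grpc-message come from the gRPC over HTTP/2 protocol, while the Code enum, the parse_status_trailers helper and the sample trailers are made up for illustration. A real client library does this for you.

```python
import enum
from urllib.parse import unquote


class Code(enum.IntEnum):
    """Numeric gRPC status codes (same numbers as grpc.StatusCode)."""
    OK = 0
    CANCELLED = 1
    UNKNOWN = 2
    INVALID_ARGUMENT = 3
    DEADLINE_EXCEEDED = 4
    NOT_FOUND = 5
    ALREADY_EXISTS = 6
    PERMISSION_DENIED = 7
    RESOURCE_EXHAUSTED = 8
    FAILED_PRECONDITION = 9
    ABORTED = 10
    OUT_OF_RANGE = 11
    UNIMPLEMENTED = 12
    INTERNAL = 13
    UNAVAILABLE = 14
    DATA_LOSS = 15
    UNAUTHENTICATED = 16


def parse_status_trailers(trailers):
    """Return (Code, message) from a dict of HTTP/2 trailer fields."""
    code = Code(int(trailers["grpc-status"]))
    # grpc-message is optional and percent-encoded on the wire.
    message = unquote(trailers.get("grpc-message", ""))
    return code, message


# Hypothetical trailers of a failed call.
sample = {"grpc-status": "3", "grpc-message": "Name%20too%20long"}
code, message = parse_status_trailers(sample)
print(code.name, message)
```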

This error model is the one implemented by gRPC itself, and every gRPC language implementation supports it.

StatusCode

grpc.StatusCode is defined as follows:

@enum.unique
class StatusCode(enum.Enum):
    OK = (_cygrpc.StatusCode.ok, 'ok')
    CANCELLED = (_cygrpc.StatusCode.cancelled, 'cancelled')
    UNKNOWN = (_cygrpc.StatusCode.unknown, 'unknown')
    INVALID_ARGUMENT = (_cygrpc.StatusCode.invalid_argument, 'invalid argument')
    DEADLINE_EXCEEDED = (_cygrpc.StatusCode.deadline_exceeded,
                         'deadline exceeded')
    NOT_FOUND = (_cygrpc.StatusCode.not_found, 'not found')
    ALREADY_EXISTS = (_cygrpc.StatusCode.already_exists, 'already exists')
    PERMISSION_DENIED = (_cygrpc.StatusCode.permission_denied,
                         'permission denied')
    RESOURCE_EXHAUSTED = (_cygrpc.StatusCode.resource_exhausted,
                          'resource exhausted')
    FAILED_PRECONDITION = (_cygrpc.StatusCode.failed_precondition,
                           'failed precondition')
    ABORTED = (_cygrpc.StatusCode.aborted, 'aborted')
    OUT_OF_RANGE = (_cygrpc.StatusCode.out_of_range, 'out of range')
    UNIMPLEMENTED = (_cygrpc.StatusCode.unimplemented, 'unimplemented')
    INTERNAL = (_cygrpc.StatusCode.internal, 'internal')
    UNAVAILABLE = (_cygrpc.StatusCode.unavailable, 'unavailable')
    DATA_LOSS = (_cygrpc.StatusCode.data_loss, 'data loss')
    UNAUTHENTICATED = (_cygrpc.StatusCode.unauthenticated, 'unauthenticated')

The numeric codes behind _cygrpc.StatusCode:

class StatusCode:
    # no doc
    aborted = 10
    already_exists = 6
    cancelled = 1
    data_loss = 15
    deadline_exceeded = 4
    failed_precondition = 9
    internal = 13
    invalid_argument = 3
    not_found = 5
    ok = 0
    out_of_range = 11
    permission_denied = 7
    resource_exhausted = 8
    unauthenticated = 16
    unavailable = 14
    unimplemented = 12
    unknown = 2
    __qualname__ = 'StatusCode'

Usage

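Setting the status code directly:

import grpc

# Modules generated from the demo's .proto file
import hello_pb2
import hello_pb2_grpc


class HelloServicer(hello_pb2_grpc.HelloServiceServicer):

    def SayHelloStrict(self, request, context):
        # `> 10` so that the check matches the error message below
        if len(request.Name) > 10:
            msg = 'Length of `Name` cannot be more than 10 characters'
            context.set_details(msg)
            context.set_code(grpc.StatusCode.INVALID_ARGUMENT)
            # A response object still has to be returned
            return hello_pb2.HelloResp()
        return hello_pb2.HelloResp(Result="Hey, {}!".format(request.Name))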

Using abort:

class HelloServicer(hello_pb2_grpc.HelloServiceServicer):

    def SayHelloStrict(self, request, context):
        # `> 10` so that the check matches the error message below
        if len(request.Name) > 10:
            msg = 'Length of `Name` cannot be more than 10 characters'
            # abort() raises to terminate the RPC, so no return is needed here
            context.abort(grpc.StatusCode.INVALID_ARGUMENT, msg)
        return hello_pb2.HelloResp(Result="Hey, {}!".format(request.Name))

Client:

try:
    response = stub.SayHelloStrict(hello_pb2.HelloReq(Name='Leonhard Euler'))
except grpc.RpcError as e:
    # ouch!
    # let's print the gRPC error message
    # which is "Length of `Name` cannot be more than 10 characters"
    print(e.details())
    # let's access the error code, which is `INVALID_ARGUMENT`
    # `type` of `status_code` is `grpc.StatusCode`
    status_code = e.code()
    # should print `INVALID_ARGUMENT`
    print(status_code.name)
    # should print `(3, 'invalid argument')`
    print(status_code.value)
    # want to do some specific action based on the error?
    if grpc.StatusCode.INVALID_ARGUMENT == status_code:
        # do your stuff here
        pass
else:
    print(response.Result)
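
If several call sites need the same information, the code/details extraction can be pulled into one helper. The sketch below is only an illustration: summarize_rpc_error is a made-up name, and FakeStatusCode and FakeRpcError are stand-ins so that the snippet runs without a server. In real code you would pass in the grpc.RpcError caught from the stub call, which exposes the same code() and details() methods.

```python
import enum


class FakeStatusCode(enum.Enum):
    """Stand-in with the same (number, text) value shape as grpc.StatusCode."""
    INVALID_ARGUMENT = (3, 'invalid argument')


class FakeRpcError(Exception):
    """Stand-in for the grpc.RpcError raised by a stub call."""

    def code(self):
        return FakeStatusCode.INVALID_ARGUMENT

    def details(self):
        return 'Length of `Name` cannot be more than 10 characters'


def summarize_rpc_error(err):
    """Collect the status of a failed call into a plain dict."""
    status_code = err.code()
    number, text = status_code.value  # e.g. (3, 'invalid argument')
    return {
        "name": status_code.name,
        "number": number,
        "text": text,
        "details": err.details(),
    }


try:
    raise FakeRpcError()
except FakeRpcError as e:
    summary = summarize_rpc_error(e)
    print(summary["name"], summary["number"], summary["details"])
```
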

Full demo: https://github.com/avinassh/grpc-errors/

Richer error model

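Overview

As the Standard error model above shows, gRPC's official error model is quite general and does not depend on the data format gRPC carries (it works whether or not you use protobuf). Its functionality is limited, though: it can only return one code plus one message, and it cannot express richer error information.

If you use protobuf (and who doesn't...), you can use the error model that Google developed and uses itself. It is described in the Google Cloud API design guide: https://cloud.google.com/apis/design/errors#error_model. Supported languages include C++, Go, Java, Python, and Ruby.
I can't help complaining here: the official gRPC documentation is mostly limited to tutorials, and far too much has to be collected from outside the official docs.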

The rich error model defines the following error structure:

// In the googleapis repository: https://github.com/googleapis/googleapis/blob/master/google/rpc/status.proto
package google.rpc;

// The `Status` type defines a logical error model that is suitable for
// different programming environments, including REST APIs and RPC APIs.
message Status {
  // A simple error code that can be easily handled by the client. The
  // actual error code is defined by `google.rpc.Code`.
  int32 code = 1;

  // A developer-facing human-readable error message in English. It should
  // both explain the error and offer an actionable resolution to it.
  string message = 2;

  // Additional error information that the client code can use to handle
  // the error, such as retry info or a help link.
  repeated google.protobuf.Any details = 3;
}

code

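The possible values of code are defined in: https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto
These values are exactly the same as the ones defined in grpc.StatusCode.
Google APIs must use the canonical error codes defined by google.rpc.Code. Individual APIs should avoid defining additional error codes, because developers are very unlikely to write logic that handles a large number of error codes. For reference, handling an average of three error codes per API call would mean that most application logic is just error handling, which is not a good developer experience.

message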

A textual explanation of the error.

details

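An array of Any values, so any protobuf message can be packed into it.
Google defines a set of standard error payloads: https://github.com/googleapis/googleapis/blob/master/google/rpc/error_details.proto
They cover the most common needs for API errors, such as quota failures and invalid parameters. As with error codes, developers should use these standard payloads whenever possible.

HTTP mapping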

Google defines how this error model maps to JSON HTTP APIs, e.g.:

{
  "error": {
    "code": 400,
    "message": "API key not valid. Please pass a valid API key.",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "reason": "API_KEY_INVALID",
        "domain": "googleapis.com",
        "metadata": {
          "service": "translate.googleapis.com"
        }
      }
    ]
  }
}

Examples of how code maps to HTTP status codes:

image.png
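
As a rough sketch of how this mapping could be applied in a gateway, the snippet below builds a JSON error body of the shape shown earlier. The HTTP numbers follow the comments in google/rpc/code.proto. HTTP_STATUS_BY_CODE and to_http_error are hypothetical names, not part of any library.

```python
import json

# HTTP status for each canonical code, per the comments in code.proto.
HTTP_STATUS_BY_CODE = {
    "OK": 200,
    "CANCELLED": 499,
    "UNKNOWN": 500,
    "INVALID_ARGUMENT": 400,
    "DEADLINE_EXCEEDED": 504,
    "NOT_FOUND": 404,
    "ALREADY_EXISTS": 409,
    "PERMISSION_DENIED": 403,
    "UNAUTHENTICATED": 401,
    "RESOURCE_EXHAUSTED": 429,
    "FAILED_PRECONDITION": 400,
    "ABORTED": 409,
    "OUT_OF_RANGE": 400,
    "UNIMPLEMENTED": 501,
    "INTERNAL": 500,
    "UNAVAILABLE": 503,
    "DATA_LOSS": 500,
}


def to_http_error(code_name, message, details=()):
    """Return (http_status, json_body) for a canonical code name."""
    http_status = HTTP_STATUS_BY_CODE[code_name]
    body = {
        "error": {
            "code": http_status,
            "message": message,
            "status": code_name,
            "details": list(details),
        }
    }
    return http_status, json.dumps(body, indent=2)


status, body = to_http_error(
    "INVALID_ARGUMENT",
    "API key not valid. Please pass a valid API key.",
    [{
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "reason": "API_KEY_INVALID",
        "domain": "googleapis.com",
    }],
)
print(status)
print(body)
```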

For the complete mapping, see: https://cloud.google.com/apis/design/errors#error_model
or the comments in code.proto: https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto

Usage

Server side:

import threading

import grpc
from google.protobuf import any_pb2
from google.rpc import code_pb2, error_details_pb2, status_pb2
from grpc_status import rpc_status

# Modules generated from helloworld.proto
import helloworld_pb2
import helloworld_pb2_grpc


def create_greet_limit_exceed_error_status(name):
    # Pack a standard QuotaFailure payload into an Any
    detail = any_pb2.Any()
    detail.Pack(
        error_details_pb2.QuotaFailure(violations=[
            error_details_pb2.QuotaFailure.Violation(
                subject="name: %s" % name,
                description="Limit one greeting per person",
            )
        ],))
    return status_pb2.Status(
        code=code_pb2.RESOURCE_EXHAUSTED,
        message='Request limit exceeded.',
        details=[detail],
    )


class LimitedGreeter(helloworld_pb2_grpc.GreeterServicer):

    def __init__(self):
        self._lock = threading.RLock()
        self._greeted = set()

    def SayHello(self, request, context):
        with self._lock:
            if request.name in self._greeted:
                rich_status = create_greet_limit_exceed_error_status(
                    request.name)
                # Convert google.rpc.Status into a grpc.Status and abort
                context.abort_with_status(rpc_status.to_status(rich_status))
            else:
                self._greeted.add(request.name)
        return helloworld_pb2.HelloReply(message='Hello, %s!' % request.name)

Client side:

import logging

_LOGGER = logging.getLogger(__name__)


def process(stub):
    try:
        response = stub.SayHello(helloworld_pb2.HelloRequest(name='Alice'))
        _LOGGER.info('Call success: %s', response.message)
    except grpc.RpcError as rpc_error:
        _LOGGER.error('Call failure: %s', rpc_error)
        status = rpc_status.from_call(rpc_error)
        if status is None:
            # The server sent no rich status, only code + message
            return
        for detail in status.details:
            if detail.Is(error_details_pb2.QuotaFailure.DESCRIPTOR):
                info = error_details_pb2.QuotaFailure()
                detail.Unpack(info)
                _LOGGER.error('Quota failure: %s', info)
            else:
                raise RuntimeError('Unexpected failure: %s' % detail)

Full demo: https://github.com/grpc/grpc/tree/master/examples/python/errors

A small gotcha in gRPC error handling

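An exception that is not caught inside a gRPC handler ends up with the exception information being sent to the client, which is a bit of a trap. If the service is only meant for internal calls, though, this is probably not a big problem: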

image.png
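
One possible way to avoid leaking exception text is to run the business logic through a small guard and only ever send vetted messages to the client. The following is a sketch under assumptions, not an official gRPC facility: PublicError and run_guarded are hypothetical names, and the handler usage in the comments assumes a standard servicer method.

```python
import logging

_LOGGER = logging.getLogger(__name__)


class PublicError(Exception):
    """Hypothetical error type whose message is safe to show to clients."""

    def __init__(self, code_name, message):
        super().__init__(message)
        self.code_name = code_name
        self.message = message


def run_guarded(func, request):
    """Run business logic and return (result, error).

    error is None on success, otherwise a (code_name, message) pair
    that is safe to send to the client.
    """
    try:
        return func(request), None
    except PublicError as exc:
        return None, (exc.code_name, exc.message)
    except Exception:
        # Keep the traceback in the server log; send only a generic message.
        _LOGGER.exception("unhandled error in business logic")
        return None, ("INTERNAL", "internal error")


# Inside a servicer method this could be used roughly like so:
#
#     result, error = run_guarded(self._do_work, request)
#     if error is not None:
#         code_name, message = error
#         context.abort(getattr(grpc.StatusCode, code_name), message)
#     return result
#
# context.abort() is kept outside the guarded call on purpose: it raises
# an exception to end the RPC, which the broad `except` would swallow.
```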

References

https://grpc.io/docs/guides/error
https://github.com/grpc/grpc/tree/master/examples/python/errors
https://cloud.google.com/apis/design/errors#error_model
https://realpython.com/python-microservices-grpc/#why-rpc-and-protocol-buffers
https://hpbn.co/http2/
https://imququ.com/post/protocol-negotiation-in-http2.html
