ainatipen posted on 2022-4-6 07:26

The right way to handle gRPC errors

Standard error model

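Overview


When a gRPC call succeeds, the server returns OK (code=0) to the client.
When an error occurs, gRPC returns an error status code together with a message.
Over HTTP/2, the gRPC status code and message are sent in the trailers, as shown in the figure below: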


image.png

This error model is implemented by gRPC itself and is supported in every gRPC language.
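To make the trailer layout concrete, here is a minimal sketch of how a status could be read out of a set of trailers. The trailer names `grpc-status` and `grpc-message` come from the gRPC over HTTP/2 protocol, where the message is percent-encoded. The function name `parse_grpc_trailers` and the sample trailers are made up for illustration; real gRPC libraries do this for you.

```python
from urllib.parse import unquote


def parse_grpc_trailers(trailers):
    """Return (code, message) from a dict of trailer name -> value.

    Hypothetical helper for illustration only, not part of any gRPC library.
    """
    # grpc-status carries the numeric status code as a decimal string.
    code = int(trailers["grpc-status"])
    # grpc-message is optional and percent-encoded on the wire.
    message = unquote(trailers.get("grpc-message", ""))
    return code, message


# Sample trailers (assumed values): code 8 is RESOURCE_EXHAUSTED.
sample = {"grpc-status": "8", "grpc-message": "quota 100%25 used"}
code, message = parse_grpc_trailers(sample)
print(code, message)
```

A successful call carries `grpc-status: 0` and usually no `grpc-message` at all, which is why the message lookup falls back to an empty string.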
StatusCode


grpc.StatusCode is defined as follows:

    @enum.unique
    class StatusCode(enum.Enum):
        OK = (_cygrpc.StatusCode.ok, 'ok')
        CANCELLED = (_cygrpc.StatusCode.cancelled, 'cancelled')
        UNKNOWN = (_cygrpc.StatusCode.unknown, 'unknown')
        INVALID_ARGUMENT = (_cygrpc.StatusCode.invalid_argument, 'invalid argument')
        DEADLINE_EXCEEDED = (_cygrpc.StatusCode.deadline_exceeded,
                             'deadline exceeded')
        NOT_FOUND = (_cygrpc.StatusCode.not_found, 'not found')
        ALREADY_EXISTS = (_cygrpc.StatusCode.already_exists, 'already exists')
        PERMISSION_DENIED = (_cygrpc.StatusCode.permission_denied,
                             'permission denied')
        RESOURCE_EXHAUSTED = (_cygrpc.StatusCode.resource_exhausted,
                              'resource exhausted')
        FAILED_PRECONDITION = (_cygrpc.StatusCode.failed_precondition,
                               'failed precondition')
        ABORTED = (_cygrpc.StatusCode.aborted, 'aborted')
        OUT_OF_RANGE = (_cygrpc.StatusCode.out_of_range, 'out of range')
        UNIMPLEMENTED = (_cygrpc.StatusCode.unimplemented, 'unimplemented')
        INTERNAL = (_cygrpc.StatusCode.internal, 'internal')
        UNAVAILABLE = (_cygrpc.StatusCode.unavailable, 'unavailable')
        DATA_LOSS = (_cygrpc.StatusCode.data_loss, 'data loss')
        UNAUTHENTICATED = (_cygrpc.StatusCode.unauthenticated, 'unauthenticated')

The numeric codes behind _cygrpc.StatusCode:

    class StatusCode:
        # no doc
        aborted = 10
        already_exists = 6
        cancelled = 1
        data_loss = 15
        deadline_exceeded = 4
        failed_precondition = 9
        internal = 13
        invalid_argument = 3
        not_found = 5
        ok = 0
        out_of_range = 11
        permission_denied = 7
        resource_exhausted = 8
        unauthenticated = 16
        unavailable = 14
        unimplemented = 12
        unknown = 2
        __qualname__ = 'StatusCode'

Usage


Setting the status code directly:

    import grpc

    import hello_pb2
    import hello_pb2_grpc


    class HelloServicer(hello_pb2_grpc.HelloServiceServicer):
        def SayHelloStrict(self, request, context):
            # Reject names longer than 10 characters, matching the message below.
            if len(request.Name) > 10:
                msg = 'Length of `Name` cannot be more than 10 characters'
                context.set_details(msg)
                context.set_code(grpc.StatusCode.INVALID_ARGUMENT)
                # An (empty) response must still be returned after setting the code.
                return hello_pb2.HelloResp()
            return hello_pb2.HelloResp(Result="Hey, {}!".format(request.Name))

Using abort:

    class HelloServicer(hello_pb2_grpc.HelloServiceServicer):
        def SayHelloStrict(self, request, context):
            if len(request.Name) > 10:
                msg = 'Length of `Name` cannot be more than 10 characters'
                # abort() raises, so the handler stops here and nothing is returned.
                context.abort(grpc.StatusCode.INVALID_ARGUMENT, msg)
            return hello_pb2.HelloResp(Result="Hey, {}!".format(request.Name))
Client:

    # `stub` is a hello_pb2_grpc.HelloServiceStub bound to an open channel.
    try:
        response = stub.SayHelloStrict(hello_pb2.HelloReq(Name='Leonhard Euler'))
    except grpc.RpcError as e:
        # Print the gRPC error message, which is
        # "Length of `Name` cannot be more than 10 characters".
        print(e.details())
        # Access the error code, which is `INVALID_ARGUMENT`.
        # The type of `status_code` is `grpc.StatusCode`.
        status_code = e.code()
        # Should print `INVALID_ARGUMENT`.
        print(status_code.name)
        # Should print `(3, 'invalid argument')`.
        print(status_code.value)
        # Want to take a specific action based on the error?
        if grpc.StatusCode.INVALID_ARGUMENT == status_code:
            # Do your stuff here.
            pass
    else:
        print(response.Result)
Full demo: https://github.com/avinassh/grpc-errors/
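The "specific action based on the error" branch in the client can grow into a small dispatch function. The sketch below is one possible shape, not something the demo repository contains. The grouping of codes into `RETRYABLE` and `BAD_REQUEST` is an assumed policy chosen for illustration, and `FakeRpcError` is a hypothetical stand-in that only mimics the `code()` and `details()` methods of `grpc.RpcError` so the snippet runs without a server.

```python
from types import SimpleNamespace

# Assumed policy for illustration; gRPC itself does not prescribe these groups.
RETRYABLE = {"UNAVAILABLE", "RESOURCE_EXHAUSTED"}
BAD_REQUEST = {"INVALID_ARGUMENT", "OUT_OF_RANGE"}


def classify_rpc_error(err):
    """Map an error exposing code().name to 'retry', 'fix_request' or 'fail'."""
    name = err.code().name
    if name in RETRYABLE:
        return "retry"
    if name in BAD_REQUEST:
        return "fix_request"
    return "fail"


class FakeRpcError(Exception):
    """Hypothetical stand-in with the same code()/details() shape as grpc.RpcError."""

    def __init__(self, code_name, details):
        super().__init__(details)
        self._code = SimpleNamespace(name=code_name)
        self._details = details

    def code(self):
        return self._code

    def details(self):
        return self._details


err = FakeRpcError("INVALID_ARGUMENT",
                   "Length of `Name` cannot be more than 10 characters")
print(classify_rpc_error(err), "-", err.details())
```

With a real client the same function can be called inside `except grpc.RpcError as e:`, since it only relies on `e.code().name`.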
Richer error model

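Overview


As the Standard error model above shows, gRPC's official error model is quite general: it does not depend on the data format gRPC carries (it works whether or not you use protobuf). It is limited, though. It can only return one code plus one message and cannot carry richer error information.

If you use protobuf (and who doesn't...), you can use the error model that Google developed and uses itself, described in the Google Cloud API documentation: https://cloud.google.com/apis/design/errors#error_model. Supported languages include C++, Go, Java, Python, and Ruby.
I can't help grumbling: the official gRPC docs rarely go beyond tutorials, and far too much has to be gathered from outside them.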

The rich error model defines the following message:

    // From the googleapis repository: https://github.com/googleapis/googleapis/blob/master/google/rpc/status.proto
    package google.rpc;

    // The `Status` type defines a logical error model that is suitable for
    // different programming environments, including REST APIs and RPC APIs.
    message Status {
      // A simple error code that can be easily handled by the client. The
      // actual error code is defined by `google.rpc.Code`.
      int32 code = 1;

      // A developer-facing human-readable error message in English. It should
      // both explain the error and offer an actionable resolution to it.
      string message = 2;

      // Additional error information that the client code can use to handle
      // the error, such as retry info or a help link.
      repeated google.protobuf.Any details = 3;
    }

code


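The allowed values of code are defined in: https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto
In practice these values are identical to the ones defined in grpc.StatusCode.
Google APIs must use the canonical error codes defined by google.rpc.Code. Individual APIs should avoid defining additional error codes, because developers are unlikely to write logic that handles a large number of them. For reference, handling an average of 3 error codes per API call would mean that most application logic is just error handling, which is not a good developer experience.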
message


A textual explanation of the error.
details


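A repeated field of type Any, into which any protobuf message can be packed.
Google defines a set of standard error payloads: https://github.com/googleapis/googleapis/blob/master/google/rpc/error_details.proto
They cover the most common needs for API errors, such as quota failures and invalid arguments. As with error codes, developers should use these standard payloads whenever possible.

HTTP mapping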


Google defines how this error model maps onto JSON HTTP APIs, e.g.:

    {
      "error": {
        "code": 400,
        "message": "API key not valid. Please pass a valid API key.",
        "status": "INVALID_ARGUMENT",
        "details": [
          {
            "@type": "type.googleapis.com/google.rpc.ErrorInfo",
            "reason": "API_KEY_INVALID",
            "domain": "googleapis.com",
            "metadata": {
              "service": "translate.googleapis.com"
            }
          }
        ]
      }
    }

Some examples of how code maps to HTTP status codes:


image.png

For the full mapping, see: https://cloud.google.com/apis/design/errors#error_model
or the comments in code.proto: https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto
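The mapping can also be written down as a lookup table. The sketch below transcribes the HTTP status listed for each code in the comments of code.proto and builds a JSON error body in the shape shown earlier; it is worth checking against the linked file before relying on it. The names `HTTP_STATUS_BY_CODE` and `to_http_error` are hypothetical, chosen for this example.

```python
import json

# HTTP status for each google.rpc.Code name, per the comments in code.proto.
HTTP_STATUS_BY_CODE = {
    "OK": 200,
    "CANCELLED": 499,
    "UNKNOWN": 500,
    "INVALID_ARGUMENT": 400,
    "DEADLINE_EXCEEDED": 504,
    "NOT_FOUND": 404,
    "ALREADY_EXISTS": 409,
    "PERMISSION_DENIED": 403,
    "UNAUTHENTICATED": 401,
    "RESOURCE_EXHAUSTED": 429,
    "FAILED_PRECONDITION": 400,
    "ABORTED": 409,
    "OUT_OF_RANGE": 400,
    "UNIMPLEMENTED": 501,
    "INTERNAL": 500,
    "UNAVAILABLE": 503,
    "DATA_LOSS": 500,
}


def to_http_error(status_name, message, details=None):
    """Build a JSON-style error body for a status name (hypothetical helper)."""
    return {
        "error": {
            "code": HTTP_STATUS_BY_CODE[status_name],
            "message": message,
            "status": status_name,
            "details": details or [],
        }
    }


body = to_http_error("INVALID_ARGUMENT",
                     "API key not valid. Please pass a valid API key.")
print(json.dumps(body, indent=2))
```

Several codes share one HTTP status (for example INVALID_ARGUMENT, FAILED_PRECONDITION and OUT_OF_RANGE all map to 400), which is why the body also carries the `status` name.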
Usage


Server side:

    import threading

    from google.protobuf import any_pb2
    from google.rpc import code_pb2, error_details_pb2, status_pb2
    from grpc_status import rpc_status

    import helloworld_pb2
    import helloworld_pb2_grpc


    def create_greet_limit_exceed_error_status(name):
        # Pack a standard QuotaFailure payload into an Any for Status.details.
        detail = any_pb2.Any()
        detail.Pack(
            error_details_pb2.QuotaFailure(violations=[
                error_details_pb2.QuotaFailure.Violation(
                    subject="name: %s" % name,
                    description="Limit one greeting per person",
                )
            ],))
        return status_pb2.Status(
            code=code_pb2.RESOURCE_EXHAUSTED,
            message='Request limit exceeded.',
            details=[detail],
        )


    class LimitedGreeter(helloworld_pb2_grpc.GreeterServicer):
        def __init__(self):
            self._lock = threading.RLock()
            self._greeted = set()

        def SayHello(self, request, context):
            with self._lock:
                if request.name in self._greeted:
                    rich_status = create_greet_limit_exceed_error_status(
                        request.name)
                    # Convert google.rpc.Status to a gRPC status and abort the call.
                    context.abort_with_status(rpc_status.to_status(rich_status))
                else:
                    self._greeted.add(request.name)
            return helloworld_pb2.HelloReply(message='Hello, %s!' % request.name)
Client side:

    import logging

    import grpc
    from google.rpc import error_details_pb2
    from grpc_status import rpc_status

    import helloworld_pb2

    _LOGGER = logging.getLogger(__name__)


    def process(stub):
        try:
            response = stub.SayHello(helloworld_pb2.HelloRequest(name='Alice'))
            _LOGGER.info('Call success: %s', response.message)
        except grpc.RpcError as rpc_error:
            _LOGGER.error('Call failure: %s', rpc_error)
            # Recover the google.rpc.Status carried by the failed call.
            status = rpc_status.from_call(rpc_error)
            for detail in status.details:
                if detail.Is(error_details_pb2.QuotaFailure.DESCRIPTOR):
                    info = error_details_pb2.QuotaFailure()
                    detail.Unpack(info)
                    _LOGGER.error('Quota failure: %s', info)
                else:
                    raise RuntimeError('Unexpected failure: %s' % detail)
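Full demo: https://github.com/grpc/grpc/tree/master/examples/python/errors

A small gotcha in gRPC error handling

An exception that the server does not catch has its details sent back to the client, which is a bit of a trap. If the service is only meant to be called internally, though, this should not be a big problem: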


image.png

References

https://grpc.io/docs/guides/error
https://github.com/grpc/grpc/tree/master/examples/python/errors
https://cloud.google.com/apis/design/errors#error_model
https://realpython.com/python-microservices-grpc/#why-rpc-and-protocol-buffers
https://hpbn.co/http2/
https://imququ.com/post/protocol-negotiation-in-http2.html