Bug in rate limit calculation for issue labels – excessive points consumed


#1

Hi there,

Found a bug in the rate limit calculations. Here’s the observed behaviour:

  • Get 100 issues for 1 repo with 100 comments per issue – consumes 1 point
  • Get 100 issues for 1 repo with 100 comments & 1 label per issue – consumes 101 points :fearful:

According to the docs, my understanding is that the second request should consume 2 points, not 101 ((1+1+100)/100 = 1 for the first request and (1+1+100+100)/100 = 2 for the second request).
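
For reference, a minimal sketch of that arithmetic, assuming the documented heuristic (count the requests needed to fulfil each unique connection, sum them, then divide by 100; the rounding-down behaviour and per-connection counts here are my assumptions based on the queries below):

```python
# Hypothetical sketch of the documented rate limit cost heuristic.
# Each unique connection costs one request per parent node; the total
# is divided by 100 (rounded down here) to get the point cost.

def query_cost(connection_request_counts):
    """Sum the per-connection request counts, then floor-divide by 100."""
    return sum(connection_request_counts) // 100

# Query 1: repository (1) + issues (1) + comments (100, one per issue)
print(query_cost([1, 1, 100]))        # 1 point

# Query 2: the same, plus labels (100, one per issue)
print(query_cost([1, 1, 100, 100]))   # 2 points
```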

Following the approach recommended in my other posts, here are the curl -v requests and responses. Both requests were made with fresh tokens from fresh integration installations, so the rate limit was at 5000 at the time of each request.

Get 100 issues for 1 repo with 100 comments per issue

curl -v -H "Authorization: Bearer token" -X POST -d '{ "query": "query { repository(owner: \"some owner\" name: \"some repo\") { issues(last: 100) { edges { node { comments(first: 100) { edges { node { author { avatarUrl login url } body bodyHTML createdAt } } pageInfo { endCursor hasNextPage } totalCount } } } pageInfo { startCursor hasPreviousPage } totalCount } } }" }' https://api.github.com/graphql
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 192.30.253.116...
* TCP_NODELAY set
* Connected to api.github.com (192.30.253.116) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate: *.github.com
* Server certificate: DigiCert SHA2 High Assurance Server CA
* Server certificate: DigiCert High Assurance EV Root CA
> POST /graphql HTTP/1.1
> Host: api.github.com
> User-Agent: curl/7.51.0
> Accept: */*
> Authorization: Bearer token
> Content-Length: 322
> Content-Type: application/x-www-form-urlencoded
> 
* upload completely sent off: 322 out of 322 bytes
< HTTP/1.1 200 OK
< Server: GitHub.com
< Date: Tue, 30 May 2017 18:07:21 GMT
< Content-Type: application/json; charset=utf-8
< Content-Length: 188499
< Status: 200 OK
< X-RateLimit-Limit: 5000
< X-RateLimit-Remaining: 4999
< X-RateLimit-Reset: 1496171241
< Cache-Control: private, max-age=60, s-maxage=60
< Vary: Accept, Authorization, Cookie, X-GitHub-OTP
< ETag: "58d1fa9f170bb64f69ddb350ba47834a"
< X-GitHub-Media-Type: github.v3; format=json
< Access-Control-Expose-Headers: ETag, Link, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval
< Access-Control-Allow-Origin: *
< Content-Security-Policy: default-src 'none'
< Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
< X-Content-Type-Options: nosniff
< X-Frame-Options: deny
< X-XSS-Protection: 1; mode=block
< X-GitHub-Request-Id: E1A1:29AE:30B08CE:3A347C7:592DB4D5
< 
{"data":{"repository":{"issues": "a bunch of issue data"}}}
* Curl_http_done: called premature == 0
* Connection #0 to host api.github.com left intact

Get 100 issues for 1 repo with 100 comments & 1 label per issue

curl -v -H "Authorization: Bearer differentToken" -X POST -d '{ "query": "query { repository(owner: \"some owner\" name: \"some repo\") { issues(last: 100) { edges { node { comments(first: 100) { edges { node { author { avatarUrl login url } body bodyHTML createdAt } } pageInfo { endCursor hasNextPage } totalCount } labels(first: 1) { edges { node { color name } } pageInfo { endCursor hasNextPage } totalCount } } } pageInfo { startCursor hasPreviousPage } totalCount } } }" }' https://api.github.com/graphql
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 192.30.253.116...
* TCP_NODELAY set
* Connected to api.github.com (192.30.253.116) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate: *.github.com
* Server certificate: DigiCert SHA2 High Assurance Server CA
* Server certificate: DigiCert High Assurance EV Root CA
> POST /graphql HTTP/1.1
> Host: api.github.com
> User-Agent: curl/7.51.0
> Accept: */*
> Authorization: Bearer differentToken
> Content-Length: 419
> Content-Type: application/x-www-form-urlencoded
> 
* upload completely sent off: 419 out of 419 bytes
< HTTP/1.1 200 OK
< Server: GitHub.com
< Date: Tue, 30 May 2017 18:14:48 GMT
< Content-Type: application/json; charset=utf-8
< Content-Length: 197682
< Status: 200 OK
< X-RateLimit-Limit: 5000
< X-RateLimit-Remaining: 4899
< X-RateLimit-Reset: 1496171688
< Cache-Control: private, max-age=60, s-maxage=60
< Vary: Accept, Authorization, Cookie, X-GitHub-OTP
< ETag: "ae86f1bf67b1035bc8696c6221d7e934"
< X-GitHub-Media-Type: github.v3; format=json
< Access-Control-Expose-Headers: ETag, Link, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval
< Access-Control-Allow-Origin: *
< Content-Security-Policy: default-src 'none'
< Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
< X-Content-Type-Options: nosniff
< X-Frame-Options: deny
< X-XSS-Protection: 1; mode=block
< X-GitHub-Request-Id: E1D0:29AE:30C10E5:3A48ACC:592DB693
< 
{"data":{"repository":{"issues": "a bunch of issue data"}}}
* Curl_http_done: called premature == 0
* Connection #0 to host api.github.com left intact

Thanks
Nick


#3

I’m actually observing the same behaviour for PRs (same scenario, swapping issues for pullRequests).

But here’s the kicker: if, for each PR, I don’t just get 100 comments & 1 label but also 10 assignees, then the cost of the request drops from 101 points to 21 points, even though I’m now requesting more data…

The documentation on rate limiting is very sparse (please, please flesh it out!) so I may be missing something, but I’m pretty sure I’m not.

I’m now starting to question my choice of building our product on top of the v4 API. After trying to build the integration for the past few days, it really doesn’t seem like it’s production-ready… Have I just stepped on all the landmines and I’m exaggerating after a frustrating day, or is there some truth to this not really being a production release?


#4

Hey @nomeyer!

I gave each of those requests in your first message a try. For this request, I get a cost of 1:

Get 100 issues for 1 repo with 100 comments per issue

{
  rateLimit {
    cost
  }
  repository(owner: "github", name: "linguist") {
    issues(last: 100) {
      edges {
        node {
          comments(first: 100) {
            edges {
              node {
                author {
                  avatarUrl
                  login
                  url
                }
                body
                bodyHTML
                createdAt
              }
            }
            pageInfo {
              endCursor
              hasNextPage
            }
            totalCount
          }
        }
      }
      pageInfo {
        startCursor
        hasPreviousPage
      }
      totalCount
    }
  }
}

For this request, I get a cost of 2:

Get 100 issues for 1 repo with 100 comments & 1 label per issue

{
  rateLimit {
    cost
  }
  repository(owner: "github", name: "linguist") {
    issues(last: 100) {
      edges {
        node {
          comments(first: 100) {
            edges {
              node {
                author {
                  avatarUrl
                  login
                  url
                }
                body
                bodyHTML
                createdAt
              }
            }
            pageInfo {
              endCursor
              hasNextPage
            }
            totalCount
          }
          labels(first: 1) {
            edges {
              node {
                color
                name
              }
            }
          }
        }
      }
      pageInfo {
        startCursor
        hasPreviousPage
      }
      totalCount
    }
  }
}

For this request, I get a cost of 3:

But here’s the kicker: if, for each PR, I don’t just get 100 comments & 1 label but also 10 assignees, then the cost of the request drops from 101 points to 21 points, even though I’m now requesting more data…

{
  rateLimit {
    cost
  }
  repository(owner: "github", name: "linguist") {
    issues(last: 100) {
      edges {
        node {
          comments(first: 100) {
            edges {
              node {
                author {
                  avatarUrl
                  login
                  url
                }
                body
                bodyHTML
                createdAt
              }
            }
            pageInfo {
              endCursor
              hasNextPage
            }
            totalCount
          }
          labels(first: 1) {
            edges {
              node {
                color
                name
              }
            }
          }
          assignees(first: 10) {
            edges {
              node {
                id
              }
            }
          }
        }
      }
      pageInfo {
        startCursor
        hasPreviousPage
      }
      totalCount
    }
  }
}
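
As a quick sanity check of that cost of 3, assuming the documented per-connection counting (the counts below are my assumptions: repository and issues cost one request each, while comments, labels, and assignees each cost one request per issue):

```python
# Assumed request counts for the three-connection query: repository (1),
# issues (1), plus comments, labels, and assignees (100 each, one per issue).
total = 1 + 1 + 100 + 100 + 100
print(total // 100)  # 3 points, matching the cost reported here
```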

If you’re still seeing a different cost for either of these – could you send back an updated curl -v snippet including the rateLimit { cost } in the query so I can take a look?

The documentation on rate limiting is very sparse (please, please flesh it out!) so I may be missing something, but I’m pretty sure I’m not.

I’d love to learn more about what you mean. Could you let us know what kinds of additional documentation would put you more at ease about this topic? With your input, I can pass this to our Documentation team for awareness and visibility!

I’m now starting to question my choice of building our product on top of the v4 API. After trying to build the integration for the past few days, it really doesn’t seem like it’s production-ready… Have I just stepped on all the landmines and I’m exaggerating after a frustrating day, or is there some truth to this not really being a production release?

I hear what you’re saying, thanks for sharing these thoughts with us too! I wanted to let you know that we’re aware that things aren’t perfect, yet we’re working as quickly as we can to fix these issues. We’d love for you to keep using the GraphQL API and though we’re not yet at feature parity with the REST API, we’re actively looking for schema requests and things that will allow us to prioritize what to add.

I hope that helps – please let us know how else we can be of help!


#5

Hey @francisfuzz, thanks for this!

I can confirm that all the requests discussed above now consume the expected number of points, matching those you’ve listed! Great :slight_smile:

Sure, so it’s two things mainly:

  • Documenting the notion of max query complexity mentioned here.
  • Explaining in greater detail how to “add up the number of requests needed to fulfil each unique connection in the call” (step 1 here). I was usually able to guess what this number was for my queries, and by now I don’t really feel I need further clarification on this (the bugs mentioned above didn’t help either), but for newcomers I think it’ll be helpful to document this more thoroughly than by just providing one example.

At the time of writing, I also needed clarification about the max number of nodes per request but that’s been clarified now :slight_smile:

Understood, thanks. In my original post I was mostly referring to bugs. Good to see the rate limit calculation bug has been fixed – that was a major one. Two other major issues are the spurious 405 responses I mentioned here (which others have reported as well, I believe) and the issues retrieving PR reviews mentioned here. From what I can tell, both bugs are still live – I can work around them for the coming weeks, but not much longer than that.

In terms of schema requests, I’ve already put a few through, and although they didn’t all get a response, I trust they’ve been taken into consideration. They’re not currently blockers, except for this one (I’m not sure whether it’s a bug or a feature gap, though I suspect it’s a bug).

I’ve observed that bugs are fixed faster than schema requests are actioned so I’ll keep working with the v4 API.

Thanks for your response!