Overview
GitHub uses a Personal Access Token (PAT) for API authentication. Both classic and fine-grained PATs work — fine-grained is recommended because it lets you scope the token to specific orgs and repos. Tokens are free on every GitHub plan; rate limits and which org-level data you can read depend on your account type and (for fine-grained) on the resource owner's approval.
Setup guide
Create the token
- Sign in at github.com.
- Open Settings → Developer settings → Personal access tokens and pick either Fine-grained tokens (recommended) or Tokens (classic).
- Click Generate new token, give it a name like ingest, and set an expiration that fits your org's policy.
- Apply scopes:
  - Fine-grained: under Repository permissions grant Contents: Read-only, Metadata: Read-only, Issues: Read-only, Pull requests: Read-only, Actions: Read-only, Deployments: Read-only, Environments: Read-only. For org-level data also set Organization permissions → Members: Read-only, and choose the org as the Resource owner.
  - Classic: select repo (covers all repo read), read:org (orgs and members), and read:user (authenticated user).
- Generate and copy the token immediately. GitHub only shows the value once; tokens start with github_pat_ (fine-grained) or ghp_ (classic).
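A quick sanity check before pasting the token anywhere is to confirm which kind it is from its prefix. The sketch below is illustrative only: the function name token_kind is hypothetical, and it checks nothing beyond the two prefixes mentioned above (it does not contact GitHub or validate the token).

```python
def token_kind(token: str) -> str:
    """Classify a GitHub PAT by its documented prefix.

    Hypothetical helper: only inspects the prefix, never calls the API.
    """
    if token.startswith("github_pat_"):
        return "fine-grained"
    if token.startswith("ghp_"):
        return "classic"
    return "unknown"
```

For example, token_kind("ghp_abc123") returns "classic", while a value copied with a stray leading space would come back as "unknown", which is a cheap way to catch a bad paste.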
Add it to Ingest
In the Ingest UI under Connectors → GitHub, paste the token. Ingest stores it in AWS Secrets Manager under the key token.
Mind the limits
The Ingest runtime dispatches GitHub requests at 1 req/sec by default — well under the 5,000/hour primary cap — and uses AIMD backoff on 429s. Status 401 is treated as fatal so the request stops immediately if the token is invalid or revoked; 429 and 500/502/503/504 retry with exponential backoff up to five times. Watch the secondary "100 concurrent requests" and "900 points/minute" limits if you enable many fan-out endpoints (per-PR reviews/files/commits, per-repo workflow runs) across a large repo set.
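The retry rules described above can be sketched as a small decision function. This is not the Ingest runtime's code: the name next_action, the 1-second base delay, and the choice to treat every other non-2xx status as a failure are assumptions made for illustration, and the AIMD rate adjustment is left out.

```python
FATAL_STATUSES = {401}                      # invalid or revoked token: stop at once
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}
MAX_RETRIES = 5                             # "up to five times" per the text above


def next_action(status: int, retries_so_far: int, base_delay: float = 1.0):
    """Return (action, delay_seconds) for an HTTP status.

    action is "ok", "retry", or "fail". base_delay is an assumed value.
    """
    if 200 <= status < 300:
        return ("ok", 0.0)
    if status in FATAL_STATUSES:
        return ("fail", 0.0)
    if status in RETRYABLE_STATUSES and retries_so_far < MAX_RETRIES:
        # Exponential backoff: 1s, 2s, 4s, 8s, 16s with the assumed base delay.
        return ("retry", base_delay * 2 ** retries_so_far)
    # Assumption: anything else, or an exhausted retry budget, is a failure.
    return ("fail", 0.0)
```

Under these assumptions a request that keeps returning 503 waits 1 + 2 + 4 + 8 + 16 = 31 seconds in total before it is given up on.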
Pick endpoints
Start with user_repos — every per-repo endpoint downstream fans out from the (owner, repo) pairs it produces. From there:
- user, user_orgs, org_members: identity and org membership (org endpoints need the PAT authorized as a resource owner of that org)
- repo_details, repo_languages, repo_topics, repo_branches, repo_contributors: slow-changing repo metadata
- repo_issues, repo_pulls, repo_issue_comments, repo_issue_events, repo_pull_review_comments, repo_labels, repo_milestones: issues and pull-request collaboration
- repo_pull_reviews, repo_pull_files, repo_pull_commits: per-PR review history (one request per pull request, so volume scales with PR count)
- repo_commits, repo_releases, repo_tags, repo_stargazers, repo_forks: version-control and adoption signals
- repo_workflows, repo_workflow_runs, repo_deployments, repo_environments: Actions and deployment history
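Because the per-PR endpoints issue one request per pull request, it can help to estimate the request volume before enabling them. The arithmetic below is a rough lower bound under stated assumptions: the function name is hypothetical, it counts exactly one request per PR per endpoint (no extra pages), and it uses the default 1 req/sec dispatch rate.

```python
def per_pr_request_estimate(pr_count: int, endpoints: int = 3,
                            req_per_sec: float = 1.0):
    """Estimate requests and wall-clock hours for the per-PR endpoints.

    endpoints=3 corresponds to repo_pull_reviews, repo_pull_files and
    repo_pull_commits. Pagination is ignored, so treat this as a floor.
    """
    requests = pr_count * endpoints
    hours = requests / req_per_sec / 3600
    return requests, hours
```

With 6,000 pull requests across the selected repos, this gives 18,000 requests and about 5 hours at the default rate.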
Supported streams
28 endpoints are available out of the box. Each endpoint syncs into its own Iceberg table in Snowflake.
| Endpoint | Description | Reference |
|---|---|---|
| org_members | – | |
| repo_branches | – | |
| repo_commits | – | |
| repo_contributors | – | |
| repo_deployments | – | |
| repo_details | – | |
| repo_environments | – | |
| repo_forks | – | |
| repo_issue_comments | – | |
| repo_issue_events | – | |
| repo_issues | – | |
| repo_labels | – | |
| repo_languages | – | |
| repo_milestones | – | |
| repo_pull_commits | – | |
| repo_pull_files | – | |
| repo_pull_review_comments | – | |
| repo_pull_reviews | – | |
| repo_pulls | – | |
| repo_releases | – | |
| repo_stargazers | – | |
| repo_tags | – | |
| repo_topics | – | |
| repo_workflow_runs | – | |
| repo_workflows | – | |
| user | – | |
| user_orgs | – | |
| user_repos | – | |
Authentication
- Auth type
- Bearer Token
- Sent as header
- Authorization
- Provider docs
- docs.github.com
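To illustrate how the token travels, here is a minimal sketch that builds (but does not send) a request carrying the PAT as a Bearer token in the Authorization header. The function name build_request is hypothetical, and the Accept and X-GitHub-Api-Version headers follow GitHub's general REST recommendations rather than anything this connector is known to send.

```python
import urllib.request


def build_request(token: str,
                  url: str = "https://api.github.com/user") -> urllib.request.Request:
    """Build a GitHub REST request authenticated with a PAT.

    Only constructs the request object; nothing is sent over the network.
    """
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )
```

Passing the result to urllib.request.urlopen would send it; a 401 response at that point suggests the token is invalid, expired, or revoked.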
Performance & limits
- Rate limit
- 5,000 req/hour per authenticated PAT (~1.4 req/sec sustained). Secondary limits cap concurrent requests at 100 and content-creation at 80 req/min, but read endpoints rarely brush them. Link-header pagination at up to 100 items per page.
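Link-header pagination means each response names the next page in its Link header, and the client follows it until no rel="next" entry remains. The parser below is a minimal sketch of that step, not the connector's implementation; the name next_page_url is hypothetical and the regular expression assumes the usual `<url>; rel="name"` layout.

```python
import re

# Matches entries of the form: <https://...>; rel="next"
_LINK_ENTRY = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def next_page_url(link_header):
    """Return the rel="next" URL from a Link header, or None on the last page."""
    if not link_header:
        return None
    for url, rel in _LINK_ENTRY.findall(link_header):
        if rel == "next":
            return url
    return None
```

A fetch loop would request a URL with per_page=100, process the items, then call next_page_url on the response's Link header and stop once it returns None.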