
Data organization & structure

RevenueBase uses three core identifiers to track people, organizations, and the relationship between them:
  • RBID_PER — Unique person identifier. This follows the person across job changes. When someone moves to a new company, their RBID_PER stays the same.
  • RBID_ORG — Unique organization identifier.
  • RBID_PAO — Unique person-at-organization identifier. This is essentially RBID_PER combined with RBID_ORG, so it changes when a person moves to a new company.
If someone holds multiple current positions, they will have multiple rows in the contacts table — each with a unique RBID_PAO, but sharing the same RBID_PER.
These fields may be null when a person’s LinkedIn profile doesn’t link to an existing company page. The company name field on LinkedIn is free text — people can type anything and may skip linking to a recognized company page. Without that link, we can’t resolve the organization to a canonical RBID_ORG.
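The relationship between the three identifiers can be sketched in a few lines of Python. This is a minimal illustration, assuming contact rows are available as dictionaries keyed by the identifier names above; the sample values are invented, and the helper name is hypothetical.

```python
from collections import defaultdict


def positions_by_person(rows):
    """Group RBID_PAO values by RBID_PER (one row per person-at-organization)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["RBID_PER"]].append(row["RBID_PAO"])
    return dict(grouped)


# Hypothetical sample rows; the identifier values are made up for illustration.
sample_rows = [
    {"RBID_PER": "per-1", "RBID_ORG": "org-a", "RBID_PAO": "pao-1a"},
    {"RBID_PER": "per-1", "RBID_ORG": "org-b", "RBID_PAO": "pao-1b"},
    {"RBID_PER": "per-2", "RBID_ORG": "org-a", "RBID_PAO": "pao-2a"},
    {"RBID_PER": "per-3", "RBID_ORG": None, "RBID_PAO": "pao-3x"},
]

# People holding more than one current position share one RBID_PER.
multi_position = {
    per: paos
    for per, paos in positions_by_person(sample_rows).items()
    if len(paos) > 1
}

# Rows whose organization could not be resolved to a canonical RBID_ORG.
unresolved = [row for row in sample_rows if row.get("RBID_ORG") is None]
```

Filtering on a null RBID_ORG before joining contacts to companies keeps unresolved rows from silently dropping out of the join.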
Some profiles in the dataset contain minimal information: a name and loose company association, but limited detail on title, seniority, or work history. These come from source profiles that simply contain less data. In some cases they may be incomplete or low-quality.
The strongest indicator that a person genuinely belongs to an organization is a valid company email address. If a sparse profile has a verified email at the company domain, treat it as reliable. If it has no verified email, treat the association with lower confidence. See Data Freshness & Quality for more on assessing record quality.
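The email-based rule of thumb can be expressed as a small check. This is a sketch only: the field names `email`, `company_domain`, and `email_verified` are assumptions for illustration and may not match the actual column names in your tables.

```python
def association_confidence(profile):
    """Return 'higher' when the profile has a verified email at the company domain.

    The keys used here (email, company_domain, email_verified) are hypothetical.
    """
    email = (profile.get("email") or "").lower()
    domain = (profile.get("company_domain") or "").lower()
    verified = bool(profile.get("email_verified"))
    at_company_domain = bool(domain) and email.endswith("@" + domain)
    return "higher" if verified and at_company_domain else "lower"
```

For example, a sparse profile with a verified address at the company domain would be rated "higher", while the same profile without a verified email would be rated "lower".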

Data delivery & access

RevenueBase supports three delivery methods:
  • Snowflake Data Sharing — We share tables directly to your Snowflake account. No data movement required; query in place.
  • AWS S3 bucket — We deliver files to an S3 bucket (either one we provision for you, or your own). Access via AWS CLI.
  • Gigasheet — Browser-based spreadsheet UI for searching, filtering, and exporting data without code.
Azure Blob Storage and Google Cloud Storage delivery are available on request. Contact your RevenueBase account representative to set up these delivery methods.
For setup instructions, see the Quickstart.
RevenueBase supports sending dynamic feeds in three formats: CSV, JSON, and Parquet. When you get access to the data in S3, you get access to all three formats.
  • CSV — Best for spreadsheet tools, SQL imports, and lightweight pipelines
  • JSON — Best for programmatic ingestion and nested data structures
  • Parquet — Best for high-performance analytics and columnar query engines. Read more in Parquet’s official documentation.
Once we’ve provisioned the Snowflake share:
  1. Navigate to Data Sharing → Private Sharing in the left panel of the Snowflake UI
  2. Click on the incoming share tabs (e.g., RELEASES_EXT_SHARE or RELEASES_RAW_SHARE)
  3. Click Get Data to add the shared database to your account
This creates a RELEASES database in your Snowflake account containing the shared tables. See the Quickstart for detailed setup steps.
Feed credentials are separate from your API key. In the dashboard, open Settings → Data Feeds to see connection details and credentials.
For S3 deliveries, your bucket name and access credentials are permanent; they don’t change between monthly releases.
Each delivery creates a new date-stamped folder following this path structure:
<delivery_date>/<dataset_name>/<file_format>/
  • delivery_date — Date of delivery in YYYYMMDD format
  • dataset_name — Depends on your subscription (e.g., per, org)
  • file_format — One of json, csv, or parquet
Previous deliveries remain accessible at their original paths and are never overwritten.
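Because the folder layout is fixed, the prefix for any delivery can be built programmatically. The following is a minimal sketch; the function name is hypothetical, and the resulting prefix would be appended to your own bucket name.

```python
from datetime import date

# The three formats named in the path structure above.
FILE_FORMATS = {"json", "csv", "parquet"}


def delivery_prefix(delivery_date, dataset_name, file_format):
    """Build <delivery_date>/<dataset_name>/<file_format>/ for one delivery."""
    if file_format not in FILE_FORMATS:
        raise ValueError(f"unsupported file format: {file_format!r}")
    # delivery_date is rendered in YYYYMMDD form.
    return f"{delivery_date:%Y%m%d}/{dataset_name}/{file_format}/"
```

For instance, `delivery_prefix(date(2025, 1, 2), "per", "parquet")` yields `20250102/per/parquet/`.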

Data updates & refresh

New data releases target the 1st of each month. Depending on weekends, the actual date may shift by ±2–3 days. We send an email notification when new data is posted and ready.
Two verification cycles run in parallel:
  • Profile verification — 95% of profiles are re-verified every 90 days. “Re-verified” means the source profile was accessible and we were able to confirm or update at least the core profile information (name, title, company, location). Check the updated_at field for the most recent verification date on any record.
  • Email verification — All email addresses are re-verified every 60 days. Check the email_last_verified_at field. A valid email is a strong signal that the associated work experience is still current.
See Data Freshness & Quality for the full explanation of how verification works.
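The two verification windows can be applied to a record as a quick freshness check. This sketch assumes `updated_at` and `email_last_verified_at` have already been parsed into `datetime` values; the function name and the returned keys are illustrative.

```python
from datetime import datetime, timedelta

PROFILE_WINDOW = timedelta(days=90)  # profile re-verification cycle
EMAIL_WINDOW = timedelta(days=60)    # email re-verification cycle


def freshness(record, now):
    """Report whether each timestamp falls inside its verification window."""
    updated = record.get("updated_at")
    email_checked = record.get("email_last_verified_at")
    return {
        "profile_fresh": updated is not None and now - updated <= PROFILE_WINDOW,
        "email_fresh": email_checked is not None and now - email_checked <= EMAIL_WINDOW,
    }
```

A record that fails the profile check but passes the email check may still describe a current position, since a valid email is a strong signal on its own.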
Profiles may have older updated_at dates for several reasons:
  • Source accessibility issues — The profile returned a 404 or was temporarily unreachable during the verification pass. Profiles that remain inaccessible for one year are deprecated and removed from the dataset.
  • Platform scraping protections — LinkedIn periodically hides full profiles or experience sections from public view. This affects random profiles and can last days to months.
  • Delayed updates by individuals — People often change jobs weeks or months before updating their profiles. Someone who starts a new role in May might not update LinkedIn until September, and we’d capture that change in October or November.
See Data Freshness & Quality for detailed guidance on working with these records.

Company data & verification

We detect whether a LinkedIn company page appears to be actively managed by checking for custom content such as a logo, website address, company size, and description.
This method is not 100% accurate. Some pages may have extensive information (phone numbers, addresses, founding dates) but still appear unclaimed. The challenge is that checking pages without being logged in doesn’t show LinkedIn’s unclaimed banner, so our system relies on indirect signals. We’re actively improving the accuracy of this detection.
LinkedIn periodically rotates logo URLs to protect their assets. Updated URLs are included in each monthly data refresh, though they may eventually face the same issue.
The most reliable solution is to build a logo cache on your end and populate it incrementally with each delivery. This way you maintain persistent access regardless of URL rotation.
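An incremental logo cache might look like the sketch below. It assumes each company record carries an `RBID_ORG` and a logo URL under a hypothetical `logo_url` key, and that you supply your own `fetch` callable (for example, one that downloads the image bytes). The in-memory dictionary stands in for whatever persistent store you use.

```python
def update_logo_cache(cache, records, fetch):
    """Fetch logos for organizations not yet cached; return the IDs added.

    cache   -- dict-like store mapping RBID_ORG to image bytes
    records -- company rows from the latest delivery
    fetch   -- callable taking a URL and returning bytes (supplied by you)
    """
    added = []
    for record in records:
        org_id = record.get("RBID_ORG")
        url = record.get("logo_url")  # hypothetical column name
        if not org_id or not url or org_id in cache:
            continue
        try:
            cache[org_id] = fetch(url)
        except OSError:
            # The URL may already have rotated; try again with the next delivery.
            continue
        added.append(org_id)
    return added
```

Running this once per delivery means each logo is downloaded while its URL is still valid and is never requested again afterwards.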

API

Sign in to the RevenueBase dashboard, go to Settings → API Keys, and click Create API key. Copy and store the key securely. You can view and copy your API key anytime from Settings → API Keys in the dashboard.
Check that you are sending the key in the x-key header (not Authorization). Ensure the key is correct and has not been revoked.
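As an illustration of the header placement, the sketch below builds a request with Python's standard library. The URL shown in the usage note is a placeholder, not a documented RevenueBase endpoint, and the function name is hypothetical.

```python
import urllib.request


def build_request(url, api_key):
    """Create a request that carries the API key in the x-key header."""
    # The key goes in x-key; no Authorization header is set.
    return urllib.request.Request(url, headers={"x-key": api_key})
```

A call such as `build_request("https://api.example.com/placeholder", "YOUR_API_KEY")` returns a request object that can be passed to `urllib.request.urlopen`.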
You have exceeded the rate limit. Wait for the window to reset or implement exponential backoff. See Rate limits and error codes.
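Exponential backoff can be sketched as follows. The `RateLimited` exception is a hypothetical stand-in for however your client surfaces a rate-limit response, and the default delay and attempt count are arbitrary choices rather than documented limits.

```python
import time


class RateLimited(Exception):
    """Hypothetical error your client raises on a rate-limit response."""


def call_with_backoff(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry `call`, doubling the wait after each rate-limited attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Passing `sleep` as a parameter keeps the function easy to test; in production code you may also want to add random jitter to the delay so that concurrent clients do not retry in lockstep.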
The request body or parameters failed validation. The response detail array lists each invalid field and message. Fix those fields and retry.

Troubleshooting

Verify host, port, and TLS settings. Ensure you are using the feed username and password, not the API key. Check firewall and network rules.
Table and schema access depend on your plan. Confirm the table name and schema (e.g., public) and check the Tables & Schema and table pages (Contacts, Companies, Insights).
For S3 access errors, Snowflake share issues, missing RBID fields, and logo URL errors, see the dedicated Troubleshooting page.
