Pagination Documentation

This document describes every pagination type supported by the pipeline generator ingestion system, with configuration fields, request sequences, and troubleshooting guidance for each.

Table of Contents

  1. Overview
  2. Decision Guide
  3. Common Concepts
  4. Pagination Types
  5. Examples
  6. Best Practices & Troubleshooting

Overview

The pagination system supports multiple pagination strategies commonly used by REST APIs. Each pagination type is implemented as a handler class that extends the base PaginationHandler abstract class. The system uses a factory pattern (PaginationFactory) to instantiate the appropriate pagination handler based on the pagination object configuration.


Decision Guide

Quick Comparison Table

| Feature | Default | PaginationWithTotalPages | PageOffset | Offset | Cursor | NextPageUrl |
|---|---|---|---|---|---|---|
| Termination Method | Single request | Total pages count | Result count | Result count | Result count or cursor | Next URL presence |
| Page Parameter | None | Page number | Page number | Offset value | Cursor token | Cursor from URL |
| Limit Parameter | N/A | No | Optional | Required | Optional | Required |
| API Provides Total Pages? | N/A | ✅ Yes | ❌ No | ❌ No | ❌ No | ❌ No |
| Use Case | No pagination | APIs with page count | Page-based without count | Offset-based | Token-based | URL-based |
| Example APIs | Single response APIs | MyContactCenter | Most REST APIs | SQL-style APIs | GraphQL, Twitter | Relay connections |
| Request Pattern | GET /api | ?page=1, ?page=2... | ?page=1&limit=50 | ?offset=0&limit=100 | ?cursor=abc | ?page[after]=xyz |
| Stops When | After 1 request | current_page > totalPages | results < maxEntries | results < maxEntries | no cursor or results < maxEntries | no next URL |

Decision Flowchart

Does the API use pagination?
├─ NO → Use "default"
└─ YES → How does the API indicate the next page?
   ├─ Provides complete next page URL?
   │  └─ YES → Use "nextPageUrl"
   ├─ Uses cursor/token for next page?
   │  └─ YES → Use "cursor"
   └─ Uses page numbers or offset?
      ├─ Uses page numbers (1, 2, 3...)?
      │  └─ API provides total pages count?
      │     ├─ YES → Use "PaginationWithTotalPages"
      │     └─ NO → Use "pageOffset"
      └─ Uses offset values (0, 100, 200...)?
         └─ YES → Use "offset"
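The same decision can be written as a small function. This is only an illustrative sketch: the function name and its boolean parameters are hypothetical and are not part of the ingestion system; only the returned type identifiers come from this document.

```python
def choose_pagination_type(
    paginated: bool,
    has_next_url: bool = False,
    has_cursor: bool = False,
    uses_page_numbers: bool = False,
    has_total_pages: bool = False,
) -> str:
    """Return the pagination type identifier suggested by the decision guide.

    All parameters are hypothetical flags describing the target API.
    """
    if not paginated:
        return "default"
    if has_next_url:
        return "nextPageUrl"
    if has_cursor:
        return "cursor"
    if uses_page_numbers:
        # Page numbers: the choice depends on whether the API reports a page count.
        return "PaginationWithTotalPages" if has_total_pages else "pageOffset"
    # Remaining case: the API advances by offset values (0, 100, 200...).
    return "offset"
```

For example, an API that uses page numbers but reports no page count maps to "pageOffset".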

Key Differences: PaginationWithTotalPages vs PageOffset

These two types are the ones most often confused. The key difference is how each one decides to stop:

| Aspect | PaginationWithTotalPages | PageOffset |
|---|---|---|
| Termination Logic | current_page > totalPages | len(results) < maxEntries |
| Requires totalPages field | ✅ Yes (required) | ❌ No |
| Checks result count | ❌ No | ✅ Yes |
| API must provide page count | ✅ Yes | ❌ No |
| More efficient | ✅ Yes (knows end point) | ⚠️ No (must check each page) |

PaginationWithTotalPages:

  • Use when the API explicitly provides the total page count in its response (header or body)
  • Stops based on total pages: current_page > totalPages
  • Example: X-Pagination: {"TotalPages": 10, "CurrentPage": 1}

PageOffset:

  • Use when the API uses page numbers but does NOT provide a total page count
  • Stops when returned results are less than expected: results < maxEntriesAllowedInPage
  • Example: Returns 50 items per page, then 30 on last page → stops

Common Concepts

Pagination Object Structure

All pagination objects share common fields:

  • type (required): The pagination type identifier
  • resultsPath (required): Runtime expression path to extract results from the response
  • maxEntriesAllowedInPage (optional): Maximum number of entries per page/request

Wildcard Pagination

You can use wildcards in resultsPath:

# Simple wildcard
resultsPath: "$response.body#/*"

# Nested wildcard
resultsPath: "$response.body#/data/*"

# Alternative syntax
resultsPath: "$response.body#/{auto}"

Runtime Expression Syntax

Runtime expressions use the following format:

$response.<source>.<reference>#/<json_path>

Sources:

  • header: Extract from HTTP response headers
  • body: Extract from HTTP response body
  • query: Extract from query parameters (future use)
  • path: Extract from path parameters (future use)

Examples:

  • $response.body#/items - Extract items array from response body
  • $response.header.pagination#/TotalPages - Extract TotalPages from pagination header (JSON)
  • $response.header.X-Total-Count - Extract total count from header (string)

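Note: The # symbol separates the source from a JSON path into its value. For the body source, <reference> is omitted (for example, $response.body#/items). For a header, # indicates that the header value is a JSON object; if # is not present, the header is treated as a plain string.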
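To make the format concrete, here is a minimal sketch of how such expressions could be evaluated. It is not the system's actual parser: the function names `evaluate` and `resolve_path` are hypothetical, only the `body` and `header` sources are handled, and wildcards and case-insensitive header lookup are left out.

```python
import json
from typing import Any, Dict


def resolve_path(document: Any, path: str) -> Any:
    """Walk a slash-separated path such as "/data/users" through dicts and lists."""
    current = document
    for token in filter(None, path.split("/")):
        if isinstance(current, list):
            current = current[int(token)]  # numeric tokens index into lists
        else:
            current = current[token]
    return current


def evaluate(expression: str, headers: Dict[str, str], body: Any) -> Any:
    """Evaluate a runtime expression against response headers and a parsed body."""
    prefix = "$response."
    if not expression.startswith(prefix):
        raise ValueError(f"unsupported expression: {expression}")
    # Split "header.pagination#/TotalPages" into target and optional JSON path.
    target, has_path, path = expression[len(prefix):].partition("#")
    source, _, reference = target.partition(".")
    if source == "body":
        value = body
    elif source == "header":
        value = headers[reference]
        if has_path:
            # A "#" after a header name means the header value holds JSON.
            value = json.loads(value)
    else:
        raise ValueError(f"unsupported source: {source}")
    return resolve_path(value, path) if has_path else value


headers = {"pagination": '{"TotalPages": 10, "CurrentPage": 1}', "X-Total-Count": "250"}
body = {"items": [1, 2, 3]}
total_pages = evaluate("$response.header.pagination#/TotalPages", headers, body)
```

Note that the plain-string header case returns the raw string, so a caller that needs a number from `$response.header.X-Total-Count` would have to convert it.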


Pagination Types

1. Default Pagination

Type Identifier: default

Use Case: APIs that return all data in a single response without pagination.

Configuration:

type: "default"
resultsPath: "$response.body#/items"
maxEntriesAllowedInPage: 100

Behavior:

  • Makes exactly one request
  • Returns empty parameter map (no pagination parameters)
  • Terminates after the first response

2. Pagination with Total Pages

Type Identifier: PaginationWithTotalPages

Use Case: APIs like MyContactCenter that return total pages in response headers or body.

Configuration:

type: "PaginationWithTotalPages"
pageOffsetParam: "Page"
totalPages: "$response.header.pagination#/TotalPages"
resultsPath: "$response.body#/contacts"
maxEntriesAllowedInPage: 100

Fields:

  • pageOffsetParam (required): Query parameter name for page number (e.g., "page", "Page", "pageNumber")
  • totalPages (required): Runtime expression to extract total pages count from response
  • resultsPath (required): Runtime expression to extract results array from response
  • maxEntriesAllowedInPage (optional): Maximum entries per page (informational, not enforced)

Behavior:

  • Starts at page 1
  • Increments page number for each request
  • Extracts total pages from response (header or body)
  • Terminates when current page exceeds total pages

Request Sequence:

  1. GET /api/contacts?Page=1 → Response: TotalPages: 10
  2. GET /api/contacts?Page=2
  3. GET /api/contacts?Page=3
  4. ... continues until page > 10
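The loop behind this sequence can be sketched as follows. This is a simplified illustration, not the handler's real code: `collect_with_total_pages` and `fake_api` are hypothetical names, and the fetch function stands in for an HTTP request plus extraction of `resultsPath` and `totalPages`.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical fetch signature: takes query parameters, returns (results, total_pages).
FetchPage = Callable[[Dict[str, int]], Tuple[List[dict], int]]


def collect_with_total_pages(fetch_page: FetchPage, page_param: str = "Page") -> List[dict]:
    """Request pages 1..totalPages and concatenate their results."""
    results: List[dict] = []
    current_page = 1
    total_pages = 1  # assume one page until the first response reports the real count
    while current_page <= total_pages:
        page_results, total_pages = fetch_page({page_param: current_page})
        results.extend(page_results)
        current_page += 1
    return results


def fake_api(params: Dict[str, int]) -> Tuple[List[dict], int]:
    """Stand-in API that always reports 3 total pages and one record per page."""
    return [{"id": params["Page"]}], 3


contacts = collect_with_total_pages(fake_api)
```

The result count is never inspected here; the loop ends only because the page counter passes the reported total.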

3. Page Offset Pagination

Type Identifier: pageOffset

Use Case: APIs that use page-based pagination with optional page size limits but don't provide total page count.

Configuration:

type: "pageOffset"
pageOffsetParam: "page"
limitParam: "per_page" # Optional, can be empty string
resultsPath: "$response.body#/items"
maxEntriesAllowedInPage: 100

Fields:

  • pageOffsetParam (required): Query parameter name for page number
  • limitParam (optional): Query parameter name for page size limit (can be empty string if not used)
  • resultsPath (required): Runtime expression to extract results array from response
  • maxEntriesAllowedInPage (required): Maximum entries per page

Behavior:

  • Starts at page 1
  • Increments page number for each request
  • Includes limitParam in request if specified
  • Terminates when returned results count < maxEntriesAllowedInPage

Request Sequence (with maxEntriesAllowedInPage: 50):

  1. GET /api/users?page=1&per_page=50 → Returns 50 items
  2. GET /api/users?page=2&per_page=50 → Returns 50 items
  3. GET /api/users?page=3&per_page=50 → Returns 30 items (< 50) → STOPS
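A minimal sketch of this stop-on-short-page logic is shown below. The names `collect_page_offset` and `fake_api` are hypothetical, and sending `maxEntriesAllowedInPage` as the value of the limit parameter is an assumption made for the example.

```python
from typing import Callable, Dict, List


def collect_page_offset(
    fetch_page: Callable[[Dict[str, int]], List[int]],
    page_param: str = "page",
    limit_param: str = "per_page",
    max_entries: int = 50,
) -> List[int]:
    """Request page 1, 2, 3... until a page returns fewer than max_entries items."""
    results: List[int] = []
    page = 1
    while True:
        params = {page_param: page}
        if limit_param:  # an empty string means the API takes no limit parameter
            params[limit_param] = max_entries  # assumption: limit value = max_entries
        page_results = fetch_page(params)
        results.extend(page_results)
        if len(page_results) < max_entries:
            break
        page += 1
    return results


DATA = list(range(130))  # 130 records: two full pages of 50, then 30
requested_pages: List[int] = []


def fake_api(params: Dict[str, int]) -> List[int]:
    """Stand-in API that slices DATA by page number and page size."""
    page, size = params["page"], params["per_page"]
    requested_pages.append(page)
    return DATA[(page - 1) * size : page * size]


collected = collect_page_offset(fake_api)
```

If `max_entries` were set higher than the page size the API actually returns, the first page would already look short and the loop would stop after one request.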

4. Offset Pagination

Type Identifier: offset

Use Case: APIs that use offset/limit pagination (SQL LIMIT/OFFSET style).

Configuration:

type: "offset"
offsetParam: "offset"
limitParam: "pageSize"
resultsPath: "$response.body#/items"
maxEntriesAllowedInPage: 100

Fields:

  • offsetParam (required): Query parameter name for offset value
  • limitParam (required): Query parameter name for page size limit
  • resultsPath (required): Runtime expression to extract results array from response
  • maxEntriesAllowedInPage (required): Number of entries per page (also used as limit value)

Behavior:

  • Starts at offset 0
  • Increments offset by maxEntriesAllowedInPage for each request
  • Terminates when returned results count < maxEntriesAllowedInPage

Request Sequence:

  1. GET /api/users?offset=0&pageSize=100
  2. GET /api/users?offset=100&pageSize=100
  3. GET /api/users?offset=200&pageSize=100
  4. ... continues until results count < 100
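The offset arithmetic can be sketched like this. It is an illustration under assumptions, not the handler's implementation: `collect_offset` and `fake_api` are hypothetical names, and the fetch function stands in for the HTTP call and result extraction.

```python
from typing import Callable, Dict, List


def collect_offset(
    fetch_page: Callable[[Dict[str, int]], List[int]],
    offset_param: str = "offset",
    limit_param: str = "pageSize",
    max_entries: int = 100,
) -> List[int]:
    """Advance the offset by max_entries until a short page is returned."""
    results: List[int] = []
    offset = 0
    while True:
        # max_entries doubles as the limit value, as described in the field list.
        page_results = fetch_page({offset_param: offset, limit_param: max_entries})
        results.extend(page_results)
        if len(page_results) < max_entries:
            break
        offset += max_entries
    return results


DATA = list(range(250))  # 250 records: two full pages of 100, then 50
requested_offsets: List[int] = []


def fake_api(params: Dict[str, int]) -> List[int]:
    """Stand-in API that slices DATA by offset and page size."""
    start, size = params["offset"], params["pageSize"]
    requested_offsets.append(start)
    return DATA[start : start + size]


collected = collect_offset(fake_api)
```

One consequence of this termination rule: when the record count is an exact multiple of the page size, the loop needs one extra request that returns zero items before it stops.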

5. Cursor Pagination

Type Identifier: cursor

Use Case: APIs that use cursor/token-based pagination (e.g., GraphQL-style, Twitter API).

Configuration:

type: "cursor"
cursorParam: "cursor"
limitParam: "pageSize" # Optional
resultsPath: "$response.body#/items"
cursorPath: "$response.body#/pagination/nextCursor"
maxEntriesAllowedInPage: 100

Fields:

  • cursorParam (required): Query parameter name for cursor value
  • limitParam (optional): Query parameter name for page size limit (can be omitted)
  • resultsPath (required): Runtime expression to extract results array from response
  • cursorPath (required): Runtime expression to extract next cursor value from response
  • maxEntriesAllowedInPage (required): Maximum entries per page

Behavior:

  • Starts with no cursor (or empty cursor)
  • Extracts cursor from response using cursorPath
  • Uses extracted cursor for next request
  • Terminates when no cursor is returned or results count < maxEntriesAllowedInPage

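Request Sequence (using cursorParam: "after", limitParam: "first", and maxEntriesAllowedInPage: 50, as in Example 2 below):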

  1. GET /api/users?first=50
  2. GET /api/users?after=eyJpZCI6IjEyMzQ1In0&first=50
  3. GET /api/users?after=eyJpZCI6IjY3ODkwIn0&first=50
  4. ... continues until no cursor returned
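The cursor hand-off can be sketched as below. The names `collect_cursor` and `fake_api` are hypothetical, the fetch function stands in for the HTTP call plus `resultsPath` and `cursorPath` extraction, and sending `maxEntriesAllowedInPage` as the limit value is an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical fetch signature: takes query parameters, returns (results, next_cursor).
FetchPage = Callable[[Dict[str, object]], Tuple[List[int], Optional[str]]]


def collect_cursor(
    fetch_page: FetchPage,
    cursor_param: str = "after",
    limit_param: str = "first",
    max_entries: int = 50,
) -> List[int]:
    """Follow cursors until none is returned or a short page arrives."""
    results: List[int] = []
    cursor: Optional[str] = None
    while True:
        params: Dict[str, object] = {}
        if cursor:  # the first request carries no cursor
            params[cursor_param] = cursor
        if limit_param:
            params[limit_param] = max_entries  # assumption: limit value = max_entries
        page_results, cursor = fetch_page(params)
        results.extend(page_results)
        if not cursor or len(page_results) < max_entries:
            break
    return results


# Stand-in data: each entry maps an incoming cursor to (results, next cursor).
PAGES = {
    None: (list(range(0, 50)), "c1"),
    "c1": (list(range(50, 100)), "c2"),
    "c2": (list(range(100, 120)), None),
}
seen_cursors: List[Optional[str]] = []


def fake_api(params: Dict[str, object]) -> Tuple[List[int], Optional[str]]:
    cursor = params.get("after")
    seen_cursors.append(cursor)
    return PAGES[cursor]


collected = collect_cursor(fake_api)
```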

6. Next Page URL Pagination

Type Identifier: nextPageUrl

Use Case: APIs such as GraphQL Relay connections, or REST APIs, that return a complete URL for the next page.

Configuration:

type: "nextPageUrl"
nextUrlPath: "$response.body#/pagination/nextPageUrl"
cursorParam: "page[after]"
limitParam: "pageSize"
resultsPath: "$response.body#/items"
maxEntriesAllowedInPage: 100

Fields:

  • nextUrlPath (required): Runtime expression to extract next page URL from response
  • cursorParam (required): Query parameter name to extract from the next URL (e.g., "page[after]", "cursor")
  • limitParam (required): Query parameter name for page size limit
  • resultsPath (required): Runtime expression to extract results array from response
  • maxEntriesAllowedInPage (required): Maximum entries per page

Behavior:

  • First request includes only limitParam
  • Extracts next page URL from response using nextUrlPath
  • Parses cursor value from URL query parameters using cursorParam
  • Uses extracted cursor for subsequent requests
  • Terminates when no next URL is returned

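Request Sequence (using nextUrlPath: "$response.body#/links/next" and maxEntriesAllowedInPage: 50, as in Example 5 below):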

  1. GET /api/users?pageSize=50
  2. Response contains: {"links": {"next": "/api/users?pageSize=50&page[after]=abc123"}}
  3. GET /api/users?pageSize=50&page[after]=abc123
  4. Response contains: {"links": {"next": "/api/users?pageSize=50&page[after]=xyz789"}}
  5. GET /api/users?pageSize=50&page[after]=xyz789
  6. ... continues until no next URL returned
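The step that parses the cursor out of the next page URL can be sketched with the Python standard library. The function name `cursor_from_next_url` is hypothetical, and the handler's real parsing may differ.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def cursor_from_next_url(next_url: Optional[str], cursor_param: str) -> Optional[str]:
    """Return the value of cursor_param in next_url's query string, or None."""
    if not next_url:
        return None  # no next URL means pagination is finished
    query = parse_qs(urlsplit(next_url).query)
    values = query.get(cursor_param)
    return values[0] if values else None


cursor = cursor_from_next_url("/api/users?pageSize=50&page[after]=abc123", "page[after]")
```

Because `parse_qs` decodes percent-encoded keys, a URL that encodes the brackets as `page%5Bafter%5D` still matches a `cursorParam` of "page[after]".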

Examples

Example 1: Zoho Books Contacts API (Page Offset)

type: "pageOffset"
pageOffsetParam: "page"
limitParam: "per_page"
resultsPath: "$response.body#/contacts"
maxEntriesAllowedInPage: 200

Example 2: GraphQL-style Cursor Pagination

type: "cursor"
cursorParam: "after"
limitParam: "first"
resultsPath: "$response.body#/data/users/edges"
cursorPath: "$response.body#/data/users/pageInfo/endCursor"
maxEntriesAllowedInPage: 50

Example 3: Header-based Total Pages

type: "PaginationWithTotalPages"
pageOffsetParam: "Page"
totalPages: "$response.header.pagination#/TotalPages"
resultsPath: "$response.body#/contacts"
maxEntriesAllowedInPage: 100

Example 4: Offset-based Pagination

type: "offset"
offsetParam: "skip"
limitParam: "take"
resultsPath: "$response.body#/results"
maxEntriesAllowedInPage: 100

Example 5: Next Page URL with Complex Parameter

type: "nextPageUrl"
nextUrlPath: "$response.body#/links/next"
cursorParam: "page[after]"
limitParam: "pageSize"
resultsPath: "$response.body#/data"
maxEntriesAllowedInPage: 50

Best Practices & Troubleshooting

Best Practices

  1. Always specify resultsPath: Ensures consistent result extraction
  2. Set appropriate maxEntriesAllowedInPage: Balance between API limits and request efficiency
  3. Test pagination termination: Ensure the termination condition works correctly for your API
  4. Use header pagination when available: Header-based pagination metadata is more efficient than parsing body
  5. Choose the right pagination type: Use the decision guide to select the most efficient type for your API

Common Issues

| Issue | Cause | Solution |
|---|---|---|
| Infinite Loop | Termination condition never met | Verify maxEntriesAllowedInPage or totalPages path is correct |
| Missing Results | Incorrect resultsPath | Log response structure and verify path matches actual response |
| Wrong Page Values | Parameter names don't match API | Check API documentation for correct parameter names |
| Header Parsing Errors | Missing # separator for JSON headers | Use # when headers contain JSON: $response.header.X#/field |
| Premature Termination | maxEntriesAllowedInPage too high | Set to actual page size returned by API |

Debugging Tips

  • Log the parameter_map for each request to verify parameters are correct
  • Log the response structure to verify resultsPath correctness
  • Check that termination conditions are met after the last page
  • For cursor pagination, log cursor values to ensure they're being extracted correctly
  • For next URL pagination, log parsed URLs to verify correct cursor extraction

Implementation Details

PaginationHandler Base Class: All pagination handlers extend the PaginationHandler abstract base class, which provides:

  • get_parameter_map(): Returns query parameters for the current page request
  • should_terminate_loop(): Determines if pagination should stop
  • get_next_page(): Updates internal state for next iteration
  • update_response(): Updates the response object after each request
  • get_results(): Extracts results from response using resultsPath

Factory Pattern: The PaginationFactory class instantiates the appropriate handler:

pagination_handler = PaginationFactory.get_pagination_handler(pagination_object)
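One way these methods might fit together in a request loop is sketched below. Everything here is an assumption for illustration: the method signatures, the call order, and the classes `SketchPaginationHandler` and `SketchPageHandler` are hypothetical stand-ins and do not reproduce the real PaginationHandler or PaginationFactory.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List


class SketchPaginationHandler(ABC):
    """Hypothetical base class mirroring the method names listed above."""

    @abstractmethod
    def get_parameter_map(self) -> Dict[str, Any]:
        """Query parameters for the current page request."""

    @abstractmethod
    def update_response(self, response: Dict[str, Any]) -> None:
        """Store the latest response."""

    @abstractmethod
    def get_results(self) -> List[Any]:
        """Extract results from the stored response."""

    @abstractmethod
    def should_terminate_loop(self) -> bool:
        """Decide whether pagination should stop."""

    @abstractmethod
    def get_next_page(self) -> None:
        """Advance internal state for the next iteration."""


class SketchPageHandler(SketchPaginationHandler):
    """Simplified page-number handler; results_key replaces a real resultsPath."""

    def __init__(self, page_param: str, results_key: str, max_entries: int) -> None:
        self.page_param = page_param
        self.results_key = results_key
        self.max_entries = max_entries
        self.page = 1
        self.response: Dict[str, Any] = {}

    def get_parameter_map(self) -> Dict[str, Any]:
        return {self.page_param: self.page}

    def update_response(self, response: Dict[str, Any]) -> None:
        self.response = response

    def get_results(self) -> List[Any]:
        return self.response.get(self.results_key, [])

    def should_terminate_loop(self) -> bool:
        return len(self.get_results()) < self.max_entries

    def get_next_page(self) -> None:
        self.page += 1


def run_pagination(
    handler: SketchPaginationHandler,
    send_request: Callable[[Dict[str, Any]], Dict[str, Any]],
) -> List[Any]:
    """Drive a handler until it signals termination, collecting all results."""
    collected: List[Any] = []
    while True:
        handler.update_response(send_request(handler.get_parameter_map()))
        collected.extend(handler.get_results())
        if handler.should_terminate_loop():
            return collected
        handler.get_next_page()


DATA = [0, 1, 2, 3, 4]


def fake_send(params: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for an HTTP call: returns two items per page."""
    page = params["page"]
    return {"items": DATA[(page - 1) * 2 : page * 2]}


all_items = run_pagination(SketchPageHandler("page", "items", 2), fake_send)
```

Keeping the loop independent of the concrete handler is what lets a factory swap in a different pagination strategy without changing the calling code.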

Version History

  • v1.0: Initial implementation with 6 pagination types
  • Supports JSON and XML payloads
  • Runtime expression parsing for flexible path extraction