Library API
The Library API provides programmatic access to every data asset in the Carbon Arc platform. Use it to discover data assets, understand their structure, preview sample data, and track changes.
Available Methods
| Method | Description |
|---|---|
| client.data.get_datasets() | List all available datasets (paginated) |
| client.data.get_dataset_information() | Get details for a specific dataset |
| client.data.get_data_dictionary() | Get column definitions and metadata |
| client.data.get_data_sample() | Preview sample data rows |
| client.data.get_library_version_changes() | Check for updates and changes |
Quick Start
from carbonarc import CarbonArcClient
# Initialize the client
client = CarbonArcClient(
    host="https://api.carbonarc.co",
    token="YOUR_API_TOKEN"
)
List All Datasets
Retrieve a paginated list of all available data assets in the Carbon Arc library.
response = client.data.get_datasets()
Response Structure
{
"page": 1,
"size": 25,
"total_pages": 3,
"datasources": [
{
"dataset_id": ["CA0056"],
"dataset_name": "Credit Card – US Complete Panel",
"description": "US credit card transaction data...",
"provider_name": "Facteus",
"last_updated_timestamp": "2026-02-18 21:45:08",
"blocked": false,
"is_current": true,
"data": {
"Topics": ["Core Panel", "by Payment Method"],
"Key Metrics": ["Credit Card Spend", "Credit Card Transactions", "Credit Card Users"],
"Coverage": {"Product Brands": "2k+", "Retailers": "1.6k+"}
}
}
]
}
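The fields in this response lend themselves to simple client-side searches. The sketch below is a hypothetical helper (`find_datasets` is not part of the Carbon Arc SDK) that scans one page of a `get_datasets()` response for a keyword in the dataset name, description, or `Key Metrics`. It assumes the response has the shape shown above and runs here against a hand-built dictionary rather than a live call.

```python
def find_datasets(response, keyword):
    """Return datasets from one get_datasets() page that mention `keyword`.

    Hypothetical helper, not part of the SDK. Assumes the documented
    response shape: a 'datasources' list whose items may carry a
    'data' -> 'Key Metrics' list of strings.
    """
    needle = keyword.lower()
    matches = []
    for ds in response.get("datasources", []):
        metrics = (ds.get("data") or {}).get("Key Metrics") or []
        haystack = [ds.get("dataset_name") or "", ds.get("description") or ""]
        haystack.extend(metrics)
        # Case-insensitive substring match across all collected text fields
        if any(needle in text.lower() for text in haystack):
            matches.append(ds)
    return matches


# Hand-built response shaped like the structure above
example_response = {
    "datasources": [
        {
            "dataset_id": ["CA0056"],
            "dataset_name": "Credit Card – US Complete Panel",
            "description": "US credit card transaction data...",
            "data": {"Key Metrics": ["Credit Card Spend", "Credit Card Users"]},
        }
    ]
}

print([ds["dataset_name"] for ds in find_datasets(example_response, "spend")])
```

Because the endpoint is paginated, a search like this only covers the page that was fetched.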
Example: Display All Datasets
import pandas as pd
response = client.data.get_datasets()
# Extract datasets from response
datasets = response.get('datasources', [])
print(f"Total Datasets: {len(datasets)}")
print(f"Page {response.get('page')} of {response.get('total_pages')}")
# Create a summary DataFrame
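df = pd.DataFrame([{
    # dataset_id may come back as a list; fall back to 'N/A' if it is empty
    'dataset_id': (ds.get('dataset_id') or ['N/A'])[0] if isinstance(ds.get('dataset_id'), list) else ds.get('dataset_id'),
    'dataset_name': ds.get('dataset_name'),
    # Guard against a missing or null description before slicing
    'description': (ds.get('description') or '')[:80] + '...'
} for ds in datasets])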
print(df.to_string(index=False))
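Since `dataset_id` appears as a list in some responses and as a plain string in others, it can help to normalize it in one place. The following is a minimal sketch; `normalize_dataset_id` is a hypothetical helper name, not an SDK function.

```python
def normalize_dataset_id(value, default="N/A"):
    """Return a dataset ID as a plain string.

    Handles the list form (["CA0056"]), the string form ("CA0056"),
    and missing or empty values.
    """
    if isinstance(value, list):
        return str(value[0]) if value else default
    if value is None:
        return default
    return str(value)


print(normalize_dataset_id(["CA0056"]))
print(normalize_dataset_id("CA0056"))
print(normalize_dataset_id([]))
```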
Get Dataset Information
Retrieve detailed information about a specific data asset, including available topics.
dataset_info = client.data.get_dataset_information(dataset_id="CA0056")
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| dataset_id | string | Yes | The unique dataset identifier (e.g., "CA0056") |
Response Structure
{
"dataset_id": ["CA0056"],
"dataset_name": "Credit Card – US Complete Panel",
"description": "US credit card transaction data from 9 provider sources...",
"entity_topics": [
{
"entity_topic_id": 1,
"entity_topic_label": "Core Panel"
},
{
"entity_topic_id": 2,
"entity_topic_label": "by Payment Method"
}
]
}
Example: Explore Dataset Details
dataset_info = client.data.get_dataset_information(dataset_id="CA0056")
# Handle dataset_id that might be returned as a list
ds_id = dataset_info.get('dataset_id', 'N/A')
if isinstance(ds_id, list):
    ds_id = ds_id[0] if ds_id else 'N/A'
print(f"Dataset ID: {ds_id}")
print(f"Name: {dataset_info.get('dataset_name', 'N/A')}")
print(f"Description: {dataset_info.get('description', 'N/A')}")
# List available topics
topics = dataset_info.get('entity_topics', dataset_info.get('topics', []))
if topics:
    print(f"\nAvailable Topics ({len(topics)}):")
    for topic in topics:
        topic_id = topic.get('entity_topic_id', topic.get('id', 'N/A'))
        topic_name = topic.get('entity_topic_label', topic.get('name', 'N/A'))
        print(f" • {topic_name} (entity_topic_id: {topic_id})")
The entity_topic_id values returned here can be used to filter results in get_data_dictionary() and get_data_sample().
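One way to make those IDs easy to reuse is a label-to-ID lookup built from the `entity_topics` list. This is a sketch under the assumption that the response matches the structure shown above; `topic_ids_by_label` is a hypothetical helper, and the example runs against a hand-built dictionary.

```python
def topic_ids_by_label(dataset_info):
    """Map each entity_topic_label to its entity_topic_id.

    Hypothetical helper. Topics missing either field are skipped.
    """
    topics = dataset_info.get("entity_topics") or []
    return {
        topic["entity_topic_label"]: topic["entity_topic_id"]
        for topic in topics
        if "entity_topic_label" in topic and "entity_topic_id" in topic
    }


# Hand-built response shaped like the structure above
example_info = {
    "dataset_id": ["CA0056"],
    "entity_topics": [
        {"entity_topic_id": 1, "entity_topic_label": "Core Panel"},
        {"entity_topic_id": 2, "entity_topic_label": "by Payment Method"},
    ],
}

lookup = topic_ids_by_label(example_info)
print(lookup)
```

With a live client, a value such as `lookup["Core Panel"]` could then be passed as `entity_topic_id` to `get_data_dictionary()` or `get_data_sample()`.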
Get Data Dictionary
Retrieve column definitions and metadata for a data asset. Optionally filter by topic.
# Get dictionary for entire dataset
data_dict = client.data.get_data_dictionary(dataset_id="CA0056")
# Filter by specific topic
data_dict = client.data.get_data_dictionary(
    dataset_id="CA0056",
    entity_topic_id=1
)
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| dataset_id | string | Yes | The unique dataset identifier |
| entity_topic_id | integer | No | Filter to a specific topic (from get_dataset_information()) |
Response Structure
The response is typically a list of column definitions:
[
{
"column_name": "date",
"data_type": "DATE",
"description": "Transaction date"
},
{
"column_name": "brand_name",
"data_type": "STRING",
"description": "Name of the brand"
},
{
"column_name": "spend",
"data_type": "FLOAT",
"description": "Total dollar value of purchases"
}
]
Example: Display Column Definitions
import pandas as pd
data_dict = client.data.get_data_dictionary(dataset_id="CA0056")
if isinstance(data_dict, list) and data_dict:
    df = pd.DataFrame(data_dict)
    print(f"Total Columns: {len(df)}")
    # Display key columns
    display_cols = ['column_name', 'data_type', 'description']
    available = [c for c in display_cols if c in df.columns]
    if available:
        print(df[available].to_string(index=False))
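For a quick overview without pandas, the column definitions can also be grouped by data type. The sketch below assumes the list-of-dictionaries shape shown in the response structure; `columns_by_type` is a hypothetical helper name.

```python
from collections import defaultdict


def columns_by_type(data_dict):
    """Group column names by their data_type.

    Hypothetical helper. Columns without a data_type land under 'UNKNOWN'.
    """
    grouped = defaultdict(list)
    for col in data_dict:
        grouped[col.get("data_type", "UNKNOWN")].append(col.get("column_name"))
    return dict(grouped)


# Hand-built dictionary shaped like the response structure above
example_dict = [
    {"column_name": "date", "data_type": "DATE", "description": "Transaction date"},
    {"column_name": "brand_name", "data_type": "STRING", "description": "Name of the brand"},
    {"column_name": "spend", "data_type": "FLOAT", "description": "Total dollar value of purchases"},
]

for data_type, names in columns_by_type(example_dict).items():
    print(f"{data_type}: {', '.join(names)}")
```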
Get Data Sample
Preview sample data rows from a data asset. Useful for understanding data structure before building frameworks.
# Get sample for entire dataset
sample = client.data.get_data_sample(dataset_id="CA0056")
# Filter by specific topic
sample = client.data.get_data_sample(
    dataset_id="CA0056",
    entity_topic_id=1
)
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| dataset_id | string | Yes | The unique dataset identifier |
| entity_topic_id | integer | No | Filter to a specific topic |
Response Structure
{
"dataset_id": "CA0056",
"samples": [
{
"date": "2025-11-04",
"brand_id": 56290,
"brand_name": "Under Armour",
"company_id": 65116,
"company_name": "Under Armour, Inc",
"ticker_id": 1234,
"ticker_name": "UAA",
"spend": 125.50,
"transactions": 3,
"users": 2
}
]
}
Example: Preview Sample Data
import pandas as pd
sample_response = client.data.get_data_sample(dataset_id="CA0056")
# Extract samples from nested response
if isinstance(sample_response, dict) and 'samples' in sample_response:
    samples = sample_response['samples']
    if isinstance(samples, list) and samples:
        df = pd.DataFrame(samples)
        print(f"Sample Size: {len(df)} rows")
        print(f"Columns: {list(df.columns)}")
        print(df.head(10))
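A sample can also be cross-checked against the data dictionary to see which columns appear in one but not the other. This is a minimal sketch; `compare_columns` is a hypothetical helper, and whether the dictionary is expected to list every sampled column is an assumption rather than something the API guarantees.

```python
def compare_columns(data_dict, sample_response):
    """Compare dictionary column names with the keys found in sample rows.

    Hypothetical helper. Returns two sorted lists: columns documented but
    absent from the sample, and sample columns with no dictionary entry.
    """
    documented = {col.get("column_name") for col in data_dict if col.get("column_name")}
    sampled = set()
    for row in sample_response.get("samples", []):
        sampled.update(row.keys())
    return {
        "missing_from_sample": sorted(documented - sampled),
        "undocumented": sorted(sampled - documented),
    }


# Hand-built inputs shaped like the structures shown in this guide
example_dict = [
    {"column_name": "date", "data_type": "DATE"},
    {"column_name": "brand_name", "data_type": "STRING"},
    {"column_name": "spend", "data_type": "FLOAT"},
]
example_sample = {
    "dataset_id": "CA0056",
    "samples": [{"date": "2025-11-04", "brand_name": "Under Armour", "users": 2}],
}

print(compare_columns(example_dict, example_sample))
```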
Check Library Version Changes
Track updates and changes to the library. Useful for monitoring when entities or data assets are added, modified, or deprecated.
# Get latest version changes
changes = client.data.get_library_version_changes(version="latest")
# With filters and pagination
changes = client.data.get_library_version_changes(
    version="latest",
    dataset_id="CA0056",  # Optional: filter to specific dataset
    page=1,
    size=100
)
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| version | string | Yes | Version to check (e.g., "latest", "2026.1.1") |
| dataset_id | string | No | Filter to a specific dataset |
| topic_id | integer | No | Filter to a specific topic |
| entity_representation | string | No | Filter by entity representation (e.g., "company", "ticker") |
| page | integer | No | Page number for pagination |
| size | integer | No | Number of results per page |
| order | string | No | Sort direction ("asc" or "desc") |
Response Structure
{
"total": 20278,
"page": 1,
"size": 25,
"pages": 812,
"entities": [
{
"entity_id": "12345",
"entity_name": "Taylor Swift",
"entity_representation_name": "entertainer",
"entity_topic_id": 1,
"entity_topic_label": "Core Panel",
"dataset_id": "CA0010",
"status": "Entity Added",
"prev_ontology_version": "2026.1.1",
"current_ontology_version": "2026.1.2",
"version_release_date": "2026-01-29T22:36:26"
}
]
}
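The `page` and `pages` fields in this response make it possible to walk through every page of results. The sketch below shows one way to do that with a generator. `iter_entities` is a hypothetical helper, `fetch_page` is any callable that takes a page number and returns a response dictionary, and page numbering is assumed to start at 1 as in the sample response. The demonstration uses in-memory fake pages instead of a live call.

```python
def iter_entities(fetch_page, max_pages=None):
    """Yield entities from every page returned by `fetch_page`.

    fetch_page: callable taking a 1-based page number and returning a dict
                with 'entities' and 'pages' keys (assumed response shape).
    max_pages:  optional cap on how many pages to request.
    """
    page = 1
    while True:
        response = fetch_page(page)
        yield from response.get("entities", [])
        total_pages = response.get("pages", 1)
        if page >= total_pages:
            break
        if max_pages is not None and page >= max_pages:
            break
        page += 1


# In-memory stand-in for the API, for illustration only
fake_pages = {
    1: {"page": 1, "pages": 2, "entities": [{"entity_id": "1"}, {"entity_id": "2"}]},
    2: {"page": 2, "pages": 2, "entities": [{"entity_id": "3"}]},
}

all_entities = list(iter_entities(lambda p: fake_pages[p]))
print(f"Collected {len(all_entities)} entities")
```

With a live client, `fetch_page` could be `lambda p: client.data.get_library_version_changes(version="latest", page=p, size=100)`. Setting `max_pages` is worth considering when a response reports hundreds of pages.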
Example: Display Recent Changes
changes = client.data.get_library_version_changes(version="latest")
total = changes.get('total', 0)
page = changes.get('page', 1)
pages = changes.get('pages', 1)
print(f"Total Changes: {total}")
print(f"Page {page} of {pages}")
entities = changes.get('entities', [])
if entities:
    print(f"\n{'Status':<20} | {'Entity Name':<30} | {'Dataset':<10} | {'Topic'}")
    print("-" * 80)
    for item in entities[:15]:
        status = item.get('status', 'N/A')
        entity_name = item.get('entity_name', 'N/A')[:30]
        dataset_id = item.get('dataset_id', 'N/A')
        topic = item.get('entity_topic_label', '')[:25]
        print(f"{status:<20} | {entity_name:<30} | {dataset_id:<10} | {topic}")
    # Show version info
    if 'current_ontology_version' in entities[0]:
        print(f"\nCurrent Version: {entities[0].get('current_ontology_version')}")
        print(f"Release Date: {entities[0].get('version_release_date', 'N/A')}")
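When a page holds many changes, a count per dataset and status can be easier to read than a row-by-row listing. The following sketch uses `collections.Counter`; `count_changes` is a hypothetical helper, and the example entities are hand-built in the shape of the response structure above.

```python
from collections import Counter


def count_changes(entities):
    """Count change records per (dataset_id, status) pair.

    Hypothetical helper. Missing fields are counted under 'N/A'.
    """
    return Counter(
        (item.get("dataset_id", "N/A"), item.get("status", "N/A"))
        for item in entities
    )


# Hand-built entities shaped like the response structure above
example_entities = [
    {"entity_name": "Taylor Swift", "dataset_id": "CA0010", "status": "Entity Added"},
    {"entity_name": "Under Armour", "dataset_id": "CA0056", "status": "Entity Added"},
    {"entity_name": "Under Armour, Inc", "dataset_id": "CA0056", "status": "Entity Added"},
]

for (dataset_id, status), count in count_changes(example_entities).most_common():
    print(f"{dataset_id:<10} | {status:<20} | {count}")
```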
Complete Workflow Example
Here's a complete workflow demonstrating how to explore a data asset from discovery to data preview:
import pandas as pd
from carbonarc import CarbonArcClient
# Initialize client
client = CarbonArcClient(
    host="https://api.carbonarc.co",
    token="YOUR_API_TOKEN"
)
TARGET_DATASET_ID = "CA0056"
# ─────────────────────────────────────────────────────────────────
# Step 1: Get dataset information and available topics
# ─────────────────────────────────────────────────────────────────
print("STEP 1: Dataset Information")
info = client.data.get_dataset_information(dataset_id=TARGET_DATASET_ID)
print(f"Name: {info.get('dataset_name')}")
print(f"Description: {(info.get('description') or '')[:100]}...")
topics = info.get('entity_topics', info.get('topics', []))
print(f"\nAvailable Topics ({len(topics)}):")
for topic in topics:
    print(f" • {topic.get('entity_topic_label')} (ID: {topic.get('entity_topic_id')})")
# ─────────────────────────────────────────────────────────────────
# Step 2: Get data dictionary to understand columns
# ─────────────────────────────────────────────────────────────────
print("\nSTEP 2: Data Dictionary")
# Use first topic if available
if topics:
    first_topic_id = topics[0].get('entity_topic_id')
    data_dict = client.data.get_data_dictionary(
        dataset_id=TARGET_DATASET_ID,
        entity_topic_id=first_topic_id
    )
else:
    data_dict = client.data.get_data_dictionary(dataset_id=TARGET_DATASET_ID)
if isinstance(data_dict, list):
    print(f"Columns found: {len(data_dict)}")
    for col in data_dict[:10]:
        print(f" • {col.get('column_name')}: {col.get('data_type')}")
# ─────────────────────────────────────────────────────────────────
# Step 3: Preview sample data
# ─────────────────────────────────────────────────────────────────
print("\nSTEP 3: Sample Data Preview")
sample = client.data.get_data_sample(dataset_id=TARGET_DATASET_ID)
if isinstance(sample, dict) and 'samples' in sample:
    samples = sample['samples']
    if isinstance(samples, list) and samples:
        df = pd.DataFrame(samples)
        print(f"Sample rows: {len(df)}")
        print(f"Columns: {list(df.columns)[:8]}...")
        print(df.head(5))
print("\nExploration complete!")
Common Dataset IDs
Here are some commonly used dataset IDs for reference. To get the full list of all available data assets, use client.data.get_datasets().
| Dataset ID | Name | Type |
|---|---|---|
| CA0056 | Credit Card – US Complete Panel | Wallet |
| CA0028 | Credit Card – US Detailed Panel | Wallet |
| CA0029 | POS - Convenience Stores | Wallet |
| CA0034 | POS - Instore and Online | Wallet |
| CA0030 | Clickstream | Attention |
| CA0013 | Mobile App | Attention |
| CA0054 | App Intelligence | Attention |
| CA009 | Digital Advertising | Attention |
| CA0049 | Medical & Pharmacy Open Claims | Balance Sheet |
| CA0041 | Medicare Claims & Commercial Price Transparency | Balance Sheet |
| CA0040 | Trade Claims | Logistics |
| CA0025 | Freight Volume - North America | Logistics |
Error Handling
Always wrap API calls in try-except blocks to handle potential errors gracefully:
try:
    dataset_info = client.data.get_dataset_information(dataset_id="CA0056")
    print(f"Dataset: {dataset_info.get('dataset_name')}")
except Exception as e:
    print(f"Error retrieving dataset information: {e}")
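For transient failures, the same try-except pattern can be extended with a small retry loop. This is a sketch, not SDK behavior: `call_with_retries` is a hypothetical helper, it catches the generic `Exception` because the SDK's specific exception types are not covered here, and the demonstration uses a deliberately flaky stand-in function instead of a real API call.

```python
import time


def call_with_retries(func, *args, retries=3, delay=1.0, **kwargs):
    """Call `func`, retrying on any exception up to `retries` attempts.

    Waits `delay * attempt` seconds between attempts and re-raises the
    last error if every attempt fails.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return func(*args, **kwargs)
        except Exception as exc:  # assumption: no narrower SDK exception is known
            last_error = exc
            if attempt < retries:
                time.sleep(delay * attempt)
    raise last_error


# Stand-in for an API call that fails twice before succeeding
attempts = {"count": 0}


def flaky_lookup():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return {"dataset_name": "Credit Card – US Complete Panel"}


result = call_with_retries(flaky_lookup, retries=3, delay=0)
print(f"{result['dataset_name']} (after {attempts['count']} attempts)")
```

With a live client, the call might look like `call_with_retries(client.data.get_dataset_information, dataset_id="CA0056")`. Retrying only makes sense for errors that can resolve on their own; an invalid dataset ID or token will fail on every attempt.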
Next Steps
Once you've explored the Library and identified the data assets you need:
- Use the Ontology API to search for specific entities within data assets
- Build Frameworks to query and purchase data
- Set up Scheduled Deliveries for recurring data needs