The getProgramAccounts method is one of the most commonly used RPC calls in Solana to retrieve all accounts associated with a specific program. However, it does not natively provide a parameter to directly “limit” the number of results, which can result in a large response if the program manages many accounts.
So how can you optimize its use and filter or paginate data? This is where filters and additional manual pagination techniques come into play.
Solana provides filters to reduce the size of the returned dataset from getProgramAccounts. These filters allow you to apply various conditions to the data in each account:
Option | Parameters | Description |
---|---|---|
memcmp | { offset, bytes } | Filter. Matches accounts whose data, starting at offset, equals the specified byte sequence. |
dataSize | size | Filter. Returns only accounts whose data length is exactly size bytes. |
dataSlice | { offset, length } | Not a filter but a separate config option. Returns only the slice of each account's data between offset and offset + length. Useful for “trimming” the returned data. |
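Tip: Combining multiple filters can dramatically improve the efficiency of your queries. All filters in a request are combined with a logical AND, and the RPC API accepts up to four filters per request.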
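To make the matching rules concrete, here is a minimal local model of how memcmp and dataSize decide whether an account qualifies. It is only an illustration: the real matching runs on the RPC node, the function names are hypothetical, and the sketch takes raw byte arrays where the RPC expects an encoded string.

```javascript
// Local model of getProgramAccounts filter semantics (illustration only).
// Assumption: `bytes` is a plain array of byte values, not the encoded
// string that the real RPC call expects.
function matchesFilter(data, filter) {
  if ('dataSize' in filter) {
    // dataSize: the account data length must equal the given size exactly
    return data.length === filter.dataSize;
  }
  if ('memcmp' in filter) {
    // memcmp: the bytes starting at `offset` must equal `bytes`
    const { offset, bytes } = filter.memcmp;
    if (offset + bytes.length > data.length) return false;
    return bytes.every((b, i) => data[offset + i] === b);
  }
  throw new Error('Unsupported filter');
}

// An account qualifies only if it satisfies every filter (logical AND).
function matchesAllFilters(data, filters) {
  return filters.every((filter) => matchesFilter(data, filter));
}
```

Because the filters are ANDed, adding a dataSize filter next to a memcmp filter can only narrow the result set, never widen it.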
Below is an example with @solana/web3.js that shows how to use the memcmp, dataSize, and dataSlice filters to reduce the response size.
```javascript
import { Connection, PublicKey } from '@solana/web3.js';

async function filterProgramAccounts() {
  // Connect to the Solana cluster (e.g., mainnet-beta)
  const connection = new Connection('https://api.mainnet-beta.solana.com');

  // The Program ID whose accounts we want to filter
  const programId = new PublicKey('ExampleProgramId...');

  // Define filters
  const filters = [
    {
      // Filter accounts by matching bytes at a specific offset
      memcmp: {
        offset: 0, // Byte offset
        bytes: '7v1G...' // Base58-encoded sequence (example)
      }
    },
    {
      // Filter accounts whose total data size equals 200 bytes
      dataSize: 200
    }
  ];

  // dataSlice to get only pubkeys (without account data)
  const config = {
    filters: filters,
    dataSlice: { offset: 0, length: 0 } // Returns 0 bytes of data
  };

  try {
    const accounts = await connection.getProgramAccounts(programId, config);
    console.log(`Found ${accounts.length} matching accounts.`);

    // Display only the pubkeys of each account
    accounts.forEach((acc, index) => {
      console.log(`Account #${index + 1}: ${acc.pubkey.toBase58()}`);
    });
  } catch (error) {
    console.error('Error fetching filtered accounts:', error);
  }
}

filterProgramAccounts();
```
Note: In this example, dataSlice is configured to retrieve 0 bytes of data, which makes it easier to get only the list of pubkeys. You can then make a second, more selective call using getAccountInfo or getMultipleAccountsInfo if you truly need the contents of certain accounts.
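A related trick is to use dataSlice to emulate a limit: fetch only a small sortable field from every account, rank the accounts client-side, and then request full data for just the top few. The sketch below assumes the program stores a u64 little-endian value at a known offset, which is a hypothetical layout. `fetchTopAccounts` is an illustrative helper name, and `connection` is any object exposing getProgramAccounts and getMultipleAccountsInfo, as the @solana/web3.js Connection does.

```javascript
// Sketch: emulate a "limit" by ranking on a small slice of each account.
// Assumption: the program stores a u64 little-endian value at `fieldOffset`
// (a hypothetical layout; check your program's account structure).
// Keep `limit` at 100 or below so step 3 fits in one getMultipleAccountsInfo call.
async function fetchTopAccounts(connection, programId, fieldOffset, limit, filters = []) {
  // 1) Fetch only the 8 bytes we want to rank by
  const sliced = await connection.getProgramAccounts(programId, {
    filters,
    dataSlice: { offset: fieldOffset, length: 8 },
  });

  // 2) Decode the u64 and keep the `limit` largest values
  const ranked = sliced
    .filter((acc) => acc.account.data.length >= 8) // skip accounts too short to hold the field
    .map((acc) => ({
      pubkey: acc.pubkey,
      value: Buffer.from(acc.account.data).readBigUInt64LE(0),
    }))
    .sort((a, b) => (a.value < b.value ? 1 : a.value > b.value ? -1 : 0))
    .slice(0, limit);

  // 3) Fetch full account data only for the selected pubkeys
  const infos = await connection.getMultipleAccountsInfo(ranked.map((r) => r.pubkey));
  return ranked.map((r, i) => ({ ...r, info: infos[i] }));
}
```

The first call still touches every matching account, but it transfers only 8 bytes of data per account instead of the full contents.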
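Because getProgramAccounts does not offer built-in pagination, a common solution is to implement manual pagination:
1. Fetch only the pubkeys by calling getProgramAccounts with a zero-length dataSlice.
2. Split the pubkey list into batches and pass each batch to getMultipleAccountsInfo. This method allows you to retrieve detailed information (data, lamports, etc.) for multiple accounts in one call.
3. Process each batch’s data and proceed to the next, iterating through the entire list.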
```javascript
import { Connection, PublicKey } from '@solana/web3.js';

async function paginateAccounts() {
  const connection = new Connection('https://api.mainnet-beta.solana.com');
  const programId = new PublicKey('ExampleProgramId...');

  // 1) Fetch just the Pubkeys with dataSlice
  const allAccounts = await connection.getProgramAccounts(programId, {
    dataSlice: { offset: 0, length: 0 }
  });

  // 2) Split the list into batches
  const batchSize = 100;
  for (let i = 0; i < allAccounts.length; i += batchSize) {
    const batch = allAccounts.slice(i, i + batchSize);

    // Extract only the Pubkeys
    const pubkeys = batch.map(acc => acc.pubkey);

    // 3) Retrieve account info for each batch
    const accountsInfo = await connection.getMultipleAccountsInfo(pubkeys);

    // 4) Process each account in the batch
    accountsInfo.forEach((info, idx) => {
      if (info) {
        console.log(`Account: ${pubkeys[idx].toBase58()}`);
        // Process info.data, info.lamports, etc.
      }
    });

    // Cap the counter so the last (partial) batch does not overshoot the total
    const processed = Math.min(i + batchSize, allAccounts.length);
    console.log(`Batch processed: ${processed} / ${allAccounts.length}`);
  }
}

paginateAccounts();
```
With this method, you avoid handling thousands of accounts in a single call that could overwhelm the RPC node and your bandwidth.
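If you want to stop early, for example after the first page or two, the same idea can be wrapped in an async generator so that later batches are only requested when the caller asks for them. This is a sketch under assumptions: `iterateAccountPages` is a hypothetical helper name, and `connection` is any object exposing getMultipleAccountsInfo, as the @solana/web3.js Connection does.

```javascript
// Sketch: yield one page of accounts at a time, fetching lazily.
// The default page size of 100 matches the batch size used in the example above.
async function* iterateAccountPages(connection, pubkeys, pageSize = 100) {
  for (let start = 0; start < pubkeys.length; start += pageSize) {
    const pageKeys = pubkeys.slice(start, start + pageSize);
    const infos = await connection.getMultipleAccountsInfo(pageKeys);
    // Pair each pubkey with its info; info is null if the account no longer exists
    yield pageKeys.map((pubkey, i) => ({ pubkey, info: infos[i] }));
  }
}

// Usage (inside an async function):
//   for await (const page of iterateAccountPages(connection, pubkeys)) {
//     // process the page, then `break` once you have enough results
//   }
```

Breaking out of the loop ends the generator, so no further RPC calls are made for the remaining pages.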
The error “many memcmp opts in filters” typically occurs when calling the Solana RPC method getProgramAccounts with an excessive number of memcmp filters.
RPC providers set limits on the number of filters allowed per request to maintain stable performance and reduce server load.
To resolve this issue using Python and the solana-py library, split the filters into smaller batches and run one query per batch. Here’s an example demonstrating the approach:
```python
# Note: this example follows the older dict-based solana-py API. Newer releases
# use solders types and typed responses, so adjust it to your installed version.
from solana.rpc.api import Client
from solana.publickey import PublicKey

# Initialize your RPC client
client = Client("https://api.mainnet-beta.solana.com")
program_id = PublicKey("YourProgramIDHere")

# Example of too many memcmp filters (may cause errors)
filters = [
    {"memcmp": {"offset": 0, "bytes": "data1"}},
    {"memcmp": {"offset": 32, "bytes": "data2"}},
    # ... potentially more filters causing errors
]

# Recommended: Split your filters into manageable batches
filter_batches = [
    [
        {"memcmp": {"offset": 0, "bytes": "data1"}},
        {"memcmp": {"offset": 32, "bytes": "data2"}},
    ],
    [
        {"memcmp": {"offset": 64, "bytes": "data3"}},
        {"memcmp": {"offset": 96, "bytes": "data4"}},
    ],
    # Continue as needed...
]

results = []

# Execute the queries batch-by-batch.
# Each request applies only its own batch of filters, so `results` collects
# every account matched by any batch and may contain duplicates.
for batch in filter_batches:
    response = client.get_program_accounts(program_id, filters=batch)
    if response["result"]:
        results.extend(response["result"])

print(f"Total retrieved accounts: {len(results)}")
```
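One caveat with batching: filters sent in a single request are combined with a logical AND, whereas concatenating the results of separate requests gives the union of each batch's matches. If you need the accounts that satisfy every filter, one option is to intersect the per-batch results by pubkey on the client. The sketch below shows that step in JavaScript, to match the earlier web3.js examples. The helper names are hypothetical, and each result set is assumed to be an array of objects with a pubkey field.

```javascript
// Normalize a pubkey to a string key (accepts a base58 string or a PublicKey-like object).
function keyOf(pubkey) {
  return typeof pubkey === 'string' ? pubkey : pubkey.toBase58();
}

// Sketch: keep only the accounts that appear in every batch result (logical AND).
function intersectByPubkey(resultSets) {
  if (resultSets.length === 0) return [];
  const [first, ...rest] = resultSets;
  const restKeys = rest.map((set) => new Set(set.map((item) => keyOf(item.pubkey))));
  const seen = new Set();
  return first.filter((item) => {
    const key = keyOf(item.pubkey);
    if (seen.has(key)) return false; // drop duplicates within the first set
    seen.add(key);
    return restKeys.every((keys) => keys.has(key));
  });
}
```

Note that this transfers more data than a single filtered request would, so it is worth using the most selective filters in the first batch and a zero-length dataSlice where only pubkeys are needed.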
1. Use Filters Wisely: leverage memcmp or dataSize to limit your search to only those accounts you really need.
2. Take Advantage of dataSlice for Pubkeys Only: if you don’t need the entire account data, skip the overhead and reduce network load.
3. Plan for Performance: if you handle very large volumes of accounts, consider a premium-level RPC service to get higher request limits and faster response times.
4. Retries and Timeouts: when processing large data sets, implement retry logic in case of errors and handle timeouts appropriately.
5. Monitor Connection Usage: as your project scales, evaluate the need for an RPC provider with scalable plans or gRPC support for higher performance.
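As an illustration of the retry and timeout advice above, here is a minimal sketch of a wrapper with exponential backoff and a per-attempt timeout. `withRetry` and its option names are hypothetical and not part of @solana/web3.js. The default values are arbitrary starting points rather than recommendations from any RPC provider.

```javascript
// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Sketch: retry an async call with exponential backoff and a per-attempt timeout.
// Note: the timeout only stops waiting; it does not cancel the underlying request.
async function withRetry(fn, { retries = 3, baseDelayMs = 500, timeoutMs = 30000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    let timer;
    try {
      // Reject if this attempt takes longer than timeoutMs
      const timeout = new Promise((_, reject) => {
        timer = setTimeout(() => reject(new Error(`Timed out after ${timeoutMs} ms`)), timeoutMs);
      });
      return await Promise.race([fn(), timeout]);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        // Wait 1x, 2x, 4x... the base delay before the next attempt
        await sleep(baseDelayMs * 2 ** attempt);
      }
    } finally {
      clearTimeout(timer);
    }
  }
  throw lastError;
}

// Usage (inside an async function):
//   const accounts = await withRetry(() => connection.getProgramAccounts(programId, config));
```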
Even though getProgramAccounts doesn’t offer a direct limiting or pagination parameter, you can control the data returned in several ways:
1. Filters (memcmp, dataSize, dataSlice): reduce the account set and data returned.
2. Manual pagination with getMultipleAccountsInfo: optimize processing and avoid overloading the node.
3. Choose a Reliable RPC Provider: critical to avoid strict limits or frequent failures.
Engineer. CEO of GS Node. Marketing Manager at Smithii.