Professor Hantzen

History Sharding & Backfill Speed

The History Sharding documentation states: "...acquiring shards begins after synchronizing with the network and backfilling ledger history to the configured number of recent ledgers."

Is it correct to interpret this as meaning backfill speed for a given server would not be improved by enabling history sharding on that server? The statement suggests only one acquisition process may be active at a time. I was thinking that, to speed up backfill, someone could enable a shard store larger than the total history. But would this fill any faster than ordinary acquisition? Presumably those ledgers would be coming from the same source either way?

19 minutes ago, Professor Hantzen said:

Is it correct to interpret this as meaning backfill speed for a given server would not be improved by enabling history sharding on that server?

Improved with respect to what?

Because before sharding there was no backfilling at all. The servers were just saving the history from the moment they connected so they didn't have the past history at all.

39 minutes ago, tulo said:

Improved with respect to what?

Because before sharding there was no backfilling at all. The servers were just saving the history from the moment they connected so they didn't have the past history at all.

As I understand it, the shard store and the server ledger history are kept in two separate databases. When I read the documentation, I can't find a specific reference showing that these two databases interact, other than across different nodes (though of course it would make sense for them to interact locally as well). The ordering in the quoted statement suggests to me that these two stores will never interact: if the server in question is configured to acquire all ledgers, it will first complete that process before launching the sharding process (and accessing its database). I.e., it will never launch the sharding process until it already has all the ledgers.

6 minutes ago, Professor Hantzen said:

As I understand it, the shard store and the server ledger history are kept in two separate databases. When I read the documentation, I can't find a specific reference showing that these two databases interact, other than across different nodes (though of course it would make sense for them to interact locally as well). The ordering in the quoted statement suggests to me that these two stores will never interact: if the server in question is configured to acquire all ledgers, it will first complete that process before launching the sharding process (and accessing its database). I.e., it will never launch the sharding process until it already has all the ledgers.

I think the two databases are separate, i.e. one doesn't check whether the other already has the ledgers, but when retrieving historical ledgers, rippled will check both the shard store and the normal history.

There is also this quote:

Quote

The ledger store history size should at minimum be twice the ledgers per shard, due to the fact that the current shard may be chosen to be stored and it would be wasteful to reacquire that data.

but it is not clear to me.
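
One possible reading of that quote, sketched in Python. Everything here is an assumption for illustration: the shard size of 16384 ledgers, the simplified shard boundaries at multiples of the shard size, and the helper names, none of which come from rippled itself. The idea is that if the ledger store holds at least two shards' worth of recent ledgers, the most recently completed shard is always still fully present locally, so it would not need to be reacquired from peers.

```python
# Assumption: 16384 ledgers per shard; treat this as a placeholder value.
LEDGERS_PER_SHARD = 16384


def min_ledger_history(ledgers_per_shard: int = LEDGERS_PER_SHARD) -> int:
    """Minimum ledger store history suggested by the quote: twice the shard size."""
    return 2 * ledgers_per_shard


def latest_complete_shard(current_seq: int, n: int) -> tuple[int, int]:
    """First and last ledger of the newest fully closed shard.

    Simplification (hypothetical): shard k covers ledgers k*n .. (k+1)*n - 1.
    """
    k = (current_seq + 1) // n - 1
    return k * n, (k + 1) * n - 1


def shard_is_local(current_seq: int, history: int, n: int) -> bool:
    """True if the newest complete shard lies entirely inside local history."""
    first, _last = latest_complete_shard(current_seq, n)
    oldest_local = current_seq - history + 1
    return first >= oldest_local


if __name__ == "__main__":
    n = LEDGERS_PER_SHARD
    print("suggested minimum ledger_history:", min_ledger_history(n))
    # With history = 2 * n the newest complete shard is always local;
    # with history = n it usually is not.
    print(all(shard_is_local(c, 2 * n, n) for c in range(2 * n, 5 * n)))
    print(all(shard_is_local(c, n, n) for c in range(2 * n, 5 * n)))
```

Under these assumptions, a history of exactly one shard only covers the newest complete shard at the moment a shard boundary is crossed, which may be what the "wasteful to reacquire" remark is getting at.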


Also this:

Quote

The retrieval process begins with the server checking for the data locally. For data that is not available, the server requests data from its peer rippled servers. Those servers that have the data available for the requested period respond with their history. The requesting server combines those responses to create the shard. The shard is complete when it contains all the ledgers in a specific range.
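
The retrieval order described in that quote could be sketched roughly like this. It is a toy model, not rippled's implementation: the stores are plain dictionaries keyed by ledger sequence, and `acquire_shard` is a hypothetical name. It only shows the sequence of steps: check locally first, ask peers for whatever is missing, combine the responses, and call the shard complete once every ledger in the range is present.

```python
def acquire_shard(first_seq, last_seq, local_store, peers):
    """Assemble a shard for ledgers first_seq..last_seq (inclusive).

    local_store: dict mapping ledger sequence -> ledger data held locally.
    peers: list of dicts, each standing in for one peer's available history.
    Returns (shard, complete).
    """
    wanted = range(first_seq, last_seq + 1)

    # Step 1: take everything that is already available locally.
    shard = {seq: local_store[seq] for seq in wanted if seq in local_store}

    # Step 2: for ledgers not found locally, ask peers in turn.
    for seq in wanted:
        if seq in shard:
            continue
        for peer in peers:
            if seq in peer:
                shard[seq] = peer[seq]
                break

    # Step 3: the shard is complete when it has every ledger in the range.
    complete = len(shard) == len(wanted)
    return shard, complete


if __name__ == "__main__":
    local = {1: "ledger-1", 2: "ledger-2"}
    peers = [{3: "ledger-3"}, {4: "ledger-4"}]
    shard, complete = acquire_shard(1, 4, local, peers)
    print(sorted(shard), complete)
```

If the real process works the way the quote says, ledgers already held in local history would not have to be fetched again from peers when a shard is built, which bears on the original question about whether the two stores interact locally.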

 
