JoelKatz

Suggestion: Investigate/reduce memory consumption


This is one of the areas I'd love to see prioritized most. (In part because it doesn't require any work from me!) But I think bringing down the hardware requirements—or even just keeping them where they are as commodity hardware specs improve—will do a lot to facilitate the growth and decentralization of the core peer-to-peer network. If the next Raspberry Pi could be a reliable validator, imagine just how great that would be!


I think a reasonable first step would be to define a clear target - e.g. a Raspberry Pi 4 or a certain cheap, common cloud instance type (8 GB RAM, 2 vCPU, 20 GB SSD, 100 MBit/s network, Ubuntu 18.04) that should be able to run a validator at up to x times the current workload (~1-2 million accounts, ~1-2 million transactions per day with a typical mix of transaction types). So far I couldn't find any reproducible benchmarks for resource usage, and the current recommendations seem a bit overcautious (16-32 GB RAM for a validator that should never have to answer any user-facing API calls... really?).
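One low-effort way to make such numbers reproducible would be to sample the server's resident memory alongside a few server_info fields over time. A minimal sketch in Python, assuming a local rippled with its JSON-RPC port on the default 127.0.0.1:5005 and the third-party psutil and requests packages installed; the CSV columns and sampling interval are arbitrary choices, not anything rippled prescribes:

    # sample_rippled_memory.py - periodically record rippled's resident memory
    # alongside a few server_info fields so runs on different hardware and
    # workloads can be compared. Assumes rippled runs locally with JSON-RPC
    # on 127.0.0.1:5005 (the default); adjust RPC_URL otherwise.
    import csv
    import time

    import psutil
    import requests

    RPC_URL = "http://127.0.0.1:5005/"
    INTERVAL_SECONDS = 60

    def find_rippled():
        # Locate the running rippled process by name.
        for proc in psutil.process_iter(["name"]):
            if proc.info["name"] == "rippled":
                return proc
        raise RuntimeError("no rippled process found")

    def server_info():
        # rippled's JSON-RPC wraps request parameters in a one-element list.
        resp = requests.post(RPC_URL, json={"method": "server_info", "params": [{}]})
        resp.raise_for_status()
        return resp.json()["result"]["info"]

    def main():
        proc = find_rippled()
        with open("rippled_memory.csv", "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["unix_time", "rss_mb", "validated_seq", "peers", "load_factor"])
            while True:
                info = server_info()
                rss_mb = proc.memory_info().rss / (1024 * 1024)
                writer.writerow([
                    int(time.time()),
                    round(rss_mb, 1),
                    info.get("validated_ledger", {}).get("seq"),
                    info.get("peers"),
                    info.get("load_factor"),
                ])
                f.flush()
                time.sleep(INTERVAL_SECONDS)

    if __name__ == "__main__":
        main()

Running it on the candidate target hardware (RPi 4, small cloud instance) and on a beefier box under the same workload would give exactly the kind of comparable baseline that is missing today.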


Can we try to estimate how much memory each item stored in the ledger actually consumes?

I did a few calculations a while ago and, if I remember correctly, the raw data for a trust line is around 60-80 bytes.

Can we also get some numbers on the actual items in the latest ledger? How many trust lines, how many offers, how many escrows, and so on? I'm too lazy and busy to do it myself :).
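One way to get those numbers would be to page through the state tree with the ledger_data API and tally counts and raw bytes per entry type. A rough sketch, assuming a local rippled with JSON-RPC on 127.0.0.1:5005, and assuming the canonical binary encoding starts each ledger entry with a 0x11 field header followed by the 2-byte LedgerEntryType; the type-code table is a partial, best-effort mapping and not taken from this thread:

    # count_ledger_objects.py - walk the latest validated ledger via the
    # ledger_data API in binary form and tally object count and raw bytes
    # per ledger entry type. Assumes JSON-RPC on 127.0.0.1:5005.
    from collections import defaultdict

    import requests

    RPC_URL = "http://127.0.0.1:5005/"

    # Partial LedgerEntryType -> name mapping (my assumption, not from this
    # thread); unknown codes are printed as hex.
    TYPE_NAMES = {
        0x0061: "AccountRoot",
        0x0064: "DirectoryNode",
        0x006F: "Offer",
        0x0072: "RippleState",  # trust line
        0x0075: "Escrow",
        0x0078: "PayChannel",
    }

    def rpc(method, params):
        resp = requests.post(RPC_URL, json={"method": method, "params": [params]})
        resp.raise_for_status()
        return resp.json()["result"]

    def main():
        counts = defaultdict(int)
        total_bytes = defaultdict(int)
        ledger_hash = None
        marker = None
        while True:
            params = {"binary": True, "limit": 2048}
            # Pin the ledger after the first page so paging stays consistent.
            if ledger_hash is None:
                params["ledger_index"] = "validated"
            else:
                params["ledger_hash"] = ledger_hash
            if marker is not None:
                params["marker"] = marker
            result = rpc("ledger_data", params)
            ledger_hash = result.get("ledger_hash", ledger_hash)
            for entry in result["state"]:
                blob = bytes.fromhex(entry["data"])
                # Assumed layout: 0x11 field header, then big-endian uint16 type.
                code = int.from_bytes(blob[1:3], "big") if blob and blob[0] == 0x11 else -1
                name = TYPE_NAMES.get(code, hex(code))
                counts[name] += 1
                total_bytes[name] += len(blob)
            marker = result.get("marker")
            if marker is None:
                break
        for name in sorted(counts, key=counts.get, reverse=True):
            avg = total_bytes[name] / counts[name]
            print(f"{name:14s} {counts[name]:>10,d} objects, avg {avg:7.1f} bytes raw")

    if __name__ == "__main__":
        main()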


That's a bit difficult to say, since rippled usually stores that data compressed with lz4. I could check a ledger state in "binary" format (AFAIK that is the uncompressed canonical encoding of ledger objects, except inner nodes) for these things if that helps. How much that translates into actual RAM and disk usage is a different question - that depends a lot on architecture and other choices (e.g. Rust makes it easier to write safe zero-copy data structures). I guess the canonical encoding should give a reasonable lower bound in many cases (some fields could use smaller pointers in memory instead of full hashes, so it could be a bit smaller), which makes it useful for judging what is reasonable or feasible - for example, one can definitely rule out that a 1 GB RAM RPi could ever run a server without deletable accounts, because the state is simply too large.
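To get a feel for how far apart the canonical encoding and the compressed representation are, one could compress a sample of binary entries and compare sizes. A small sketch, assuming the python lz4 package and the same local JSON-RPC endpoint as above; compressing entries one by one is only a rough proxy for what the node store actually writes to disk:

    # compression_ratio.py - compare the canonical binary size of a sample of
    # ledger entries against their lz4-compressed size. Per-entry lz4 is only
    # an approximation of rippled's node-store compression, but it shows how
    # much "canonical encoding" and "bytes on disk" can differ.
    import lz4.block
    import requests

    RPC_URL = "http://127.0.0.1:5005/"

    resp = requests.post(RPC_URL, json={
        "method": "ledger_data",
        "params": [{"ledger_index": "validated", "binary": True, "limit": 256}],
    })
    resp.raise_for_status()
    entries = resp.json()["result"]["state"]

    blobs = [bytes.fromhex(e["data"]) for e in entries]
    raw = sum(len(b) for b in blobs)
    packed = sum(len(lz4.block.compress(b, store_size=False)) for b in blobs)
    print(f"{len(blobs)} entries: {raw} bytes canonical, "
          f"{packed} bytes after lz4 ({packed / raw:.0%} of original)")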

8 hours ago, Sukrim said:

That's a bit difficult to say, since rippled usually stores that data compressed with lz4.

Is the latest ledger compressed with lz4?

So every query to rippled has to decompress the data to read it?

8 hours ago, Sukrim said:

I guess the canonical encoding should give a reasonable lower bound in many cases

This.

2 hours ago, tulo said:

Is the latest ledger compressed with lz4?

So every query to rippled has to decompress the data to read it?

All data in the node database is usually compressed.

If the request can't be answered from in-memory data, you need to query that database, yes.
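The access pattern being described is essentially an in-memory cache of decoded objects sitting in front of a key-value store of compressed blobs. A deliberately simplified illustration of that pattern in Python - not rippled's actual implementation, which is C++ with NuDB/RocksDB backends:

    # Toy model of "in-memory cache in front of a compressed node store".
    import lz4.block

    class ToyNodeStore:
        def __init__(self):
            self.backend = {}  # key -> lz4-compressed blob (stands in for NuDB/RocksDB)
            self.cache = {}    # key -> raw blob (stands in for the in-memory cache)

        def put(self, key, blob):
            self.backend[key] = lz4.block.compress(blob)  # compressed at rest
            self.cache[key] = blob

        def get(self, key):
            blob = self.cache.get(key)
            if blob is None:
                # Cache miss: read from the backend and pay the decompression cost.
                blob = lz4.block.decompress(self.backend[key])
                self.cache[key] = blob
            return blob

The RAM question in this thread is largely about how big that cache (and the other in-memory state) has to be for the server to keep up.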

On 10/2/2019 at 8:38 AM, Sukrim said:

I think a reasonable first step would be to define a clear target - e.g. a Raspberry Pi 4 or a certain cheap, common cloud instance type (8 GB RAM, 2 vCPU, 20 GB SSD, 100 MBit/s network, Ubuntu 18.04) that should be able to run a validator at up to x times the current workload (~1-2 million accounts, ~1-2 million transactions per day with a typical mix of transaction types). So far I couldn't find any reproducible benchmarks for resource usage, and the current recommendations seem a bit overcautious (16-32 GB RAM for a validator that should never have to answer any user-facing API calls... really?).

Recommending hardware for production use is always tricky and one needs to balance a number of factors.

Let’s say that 8 GB of RAM is enough today. How long will it stay enough? Will it be enough if there’s a sudden spike in volume? Will boxes be under memory pressure 3 or 6 months from now?

RAM is cheap, and it’s better to be conservative and allow a big buffer than to end up under memory pressure at an inopportune time.

To be clear, I’m not opposed to your comments about defining a target re: resources and then trying to hit it. But I also don’t think it’s realistic to target a Raspberry Pi or to recommend hardware that has enough capacity for “today” but not for “tomorrow”.

36 minutes ago, nikb said:

To be clear, I’m not opposed to your comments about defining a target re: resources and then trying to hit it. But I also don’t think it’s realistic to target a Raspberry Pi or to recommend hardware that has enough capacity for “today” but not for “tomorrow”.

A few years ago 2 GB was enough. I thought 16 GB would be enough for some time, but then something happened and usage jumped to 8-10 GB. How long until it jumps again and makes 16 GB useless? It's hard to say if we have no idea what is using that memory.

Also, how can Ripple (the company) provide hardware specs without knowing that?

Maybe reserves should also be dynamic, like transaction fees, i.e. increase when the network is under heavy resource usage.

