Share your experiences with running rippled

mDuo13 · April 14, 2016

Rather than continue to derail TWarden's UNL security blanket thread, I decided it would be a good idea to share knowledge and collect information on people's experiences with running rippled. We can see what kinds of specs and settings people have had success or problems with, and go from there.

I'll start with my own experience:

I have a home desktop PC which I use for my daily computing needs. I built the computer in 2009 and have upgraded it slightly over the years. Since late January 2016, I've been running rippled as a validator more or less continuously.

System specs:

CPU: 2.66GHz Intel Core i7 "Bloomfield" CPU
RAM: 16GB Corsair XMS3 (what can I say, it was a Black Friday sale!)
Disks: System partition is on an 80GB Intel X25-M SSD. My rippled's database is on my 2x3TB 7200 RPM disks in software RAID1 using mdadm. I think they're Hitachi-branded?
OS: Arch Linux x86_64 (latest Linux kernel, which I've updated a few times in the time I've been running the validator)
Network: I'm wired to a cheapo 10/100MBit Netgear ethernet switch and use my house's residential broadband.

Settings:

I'm using the default settings from the rippled-example.cfg, except for adding my validation seed. The automatic online delete seems to work just fine.

Maintenance:

Basically, the only maintenance I do is to upgrade to a new version of rippled when one comes out. So far I haven't had any problems with disk space: rippled seems to purge old ledgers often enough to stay within a reasonable amount of disk usage.

Usage patterns:

I occasionally run RPC commands against rippled for testing / work purposes, but by and large the daemon is dedicated to validating. I do use the machine for my daily computing and I've noticed that the process tends to use up as much RAM as it can when my computer is idle, but it quickly yields it to other programs when I'm actively using the computer. There hasn't been any major adverse effect on my general computing performance.

Meanwhile, my validation agreement percentage has been hovering around 98% with few exceptions during that time.

celticwarrior72 · April 14, 2016

Very useful information.

Twarden · April 14, 2016

Notable Issues with my rippled experience included:

SSL configuration was nonfunctional for me until rippled 0.30.0
WSS configuration was nonfunctional for me until rippled 0.30.0
Online Delete does not seem to function properly
Before the deprecation of the apt repo, I experienced a ton of dependency troubles when choosing to build from source (the RPM replacement is great and rippled does perform just fine in my own experience on a DigitalOcean $20 droplet on CentOS 6.7; if you have major issues building rippled on your favourite OS then just avoid the headache and use the repo)

My rippled validator was first hosted on a Virtual Private Server. The web host I was leasing my VPS from suggested that I migrate to a dedicated server as the disk I/O was causing instability for other customers on that server. When hosting the validator on a VPS, I was using the apt package until it was about to become deprecated. During this time I was having issues with the init script and had trouble setting up WSS connections, and it was not until version 0.30.0 was released that the latter became functional. The online delete functionality did seem to work fine when I was running rippled from the apt repository, but when I began to build it from source I began to encounter disk space issues.

The VPS I ran which experienced instability had these specs:

OpenVZ VPS
E31240 Processor
3.30GHz
4 CPUs
8GB RAM + 8GB VSwap
80GB SSD
8TB Bandwidth
1Gbit Network Speed
Ubuntu 14.04.4 LTS

Since moving my validator to a dedicated server I have noticed a remarkable increase in agreement rate. On the VPS I was only able to maintain about 80-90% agreement (or sometimes very much less than that) due to the instability issue. My configuration for rippled may be improper (see below) but for some reason the online delete settings do not function. I have also noticed that I can issue the can_delete command from the CLI with either an integer or now and receive a success message, yet the disk space is never freed. I am going to write a simple bash script to prune the ledger data and then restart rippled. I currently have 35GB of remaining disk space on the server before I must prune the ~3.5M ledgers on this server tomorrow. I will set up crontab to run it at a time when I can observe rippled in operation while my experiment executes in lieu of the online delete functionality.

I do not notice any issues with RAM usage when starting applications to work in or when I start other background processes. For a time I was downloading the NXT blockchain on the server to offer some stats I was working on, plus hosting an NXT Wallet in parallel to hosting rippled as a validator.

My current server specs are:

HP DL160 G6
2x Intel E5606
8 GB PC3-10600R
Intel 520/530 480GB SSD
RAID Controller: HP P410/256MB
2 TB Bandwidth
2Gbit Connection
Ubuntu 14.04.4 LTS

My Rippled.cfg file

Quote

{ "command": "log_level", "severity": "warning" }

# If ssl_verify is 1, certificates will be validated.
# To allow the use of self-signed certificates for development or internal use,
# set ssl_verify to 0.
[ssl_verify]
1

[ssl_verify_file]
/etc/apache2/sites-available/xagate.com.key

[ssl_verify_dir]
/etc/apache2/sites-available

[server]
port_peer
port_rpc
port_ws
port_wss

[port_peer]
port = 51235
ip = 127.0.0.1
protocol = peer

[port_rpc]
port = 51234
ip = 127.0.0.1
admin = 127.0.0.1
protocol = http

[port_rpc_admin_local]
port = 5005
ip = 127.0.0.1
admin = 127.0.0.1
protocol = http

[port_ws]
port = 51233
ip = 185.82.201.85
admin = 127.0.0.1
protocol = ws

[port_wss]
port = 48212
ip = 185.82.201.85
admin = 127.0.0.1
protocol = wss
ssl_key = /etc/apache2/sites-available/xagate.com.key2
ssl_cert = /etc/apache2/sites-available/xagate.com.crt2
ssl_chain = /etc/apache2/sites-available/xagate.com.bundle.crt

[peers_max]
50

[peer_private]
0

[sntp_servers]
time.windows.com
time.apple.com
time.nist.gov
pool.ntp.org

[node_size]
medium

[node_db]
type=nudb
path=/var/lib/rippled/db/nudb
online_delete=1900000
advisory_delete=1

[ledger_history]
1810000

[database_path]
/var/lib/rippled/db

[debug_logfile]
/var/log/rippled/debug.log

[rpc_startup]
{"command": "log_level", "severity": "warning"}

[ips]
r.ripple.com 51235
xagate.com 48212

[validators]
n949f75evCHwgyP4fPVgaHqNHxUVN15PsJEZ3B3HnXPcPjcZAoy7 RL1
n9MD5h24qrQqiyBC8aeqqCWvpiBiYQ3jxSr91uiDvmrkyHRdYLUj RL2
n9L81uNCaPgtUJfaHh89gmdvXKAmSt5Gdsw2g1iPWaPkAHW5Nm4C RL3
n9KiYM9CgngLvtRCQHZwgC2gjpdaZcCcbt3VboxiNFcKuwFVujzS RL4
n9LdgEtkmGB9E2h3K4Vp7iGUaKuq23Zr32ehxiU8FWY7xoxbWTSA RL5
n9M83NCHQqwKBCqmvTaMxWJTp8AQGp6fKGTGH8wcvQFkFsf2gg27 XAGATE

[validation_seed]
ssssss

[validation_quorum]
4

Edited April 14, 2016 by Twarden
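
The deletion settings in a config like the one above can be sanity-checked mechanically. The following is only a sketch under stated assumptions: `parse_sections` and `check_online_delete` are made-up helper names, the parser handles nothing beyond the bracketed-section layout shown in this thread, and the rules it checks (online_delete has a minimum of 256, should not be smaller than ledger_history, and advisory_delete=1 makes rippled wait for a can_delete call) are taken from the comments in rippled-example.cfg as commonly understood, not from the rippled source.

```python
def parse_sections(text):
    """Split rippled-style config text into {section_name: [lines]}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


def check_online_delete(sections):
    """Return warnings about the deletion settings (hypothetical helper)."""
    node_db = {}
    for entry in sections.get("node_db", []):
        if "=" in entry:
            key, value = entry.split("=", 1)
            node_db[key.strip()] = value.strip()

    if "online_delete" not in node_db:
        return ["online_delete is not set, so automatic purging is off"]

    warnings = []
    online_delete = int(node_db["online_delete"])
    if online_delete < 256:
        warnings.append("online_delete is below the documented minimum of 256")

    # Assumed default of 256 when [ledger_history] is absent.
    history = (sections.get("ledger_history") or ["256"])[0]
    if history.isdigit() and online_delete < int(history):
        warnings.append("online_delete is smaller than ledger_history")

    if node_db.get("advisory_delete") == "1":
        warnings.append("advisory_delete=1: purging waits for an admin can_delete call")
    return warnings


# Values copied from the config posted above.
SAMPLE = """
[node_db]
type=nudb
path=/var/lib/rippled/db/nudb
online_delete=1900000
advisory_delete=1

[ledger_history]
1810000
"""

for warning in check_online_delete(parse_sections(SAMPLE)):
    print(warning)
```

With the posted values, the only thing the sketch flags is advisory_delete; the two numeric settings are at least consistent with each other.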

Sukrim · April 15, 2016

I've been running rippled more or less constantly since summer 2013.

At the moment I have a dedicated machine at home running Manjaro Linux (it is easier to just look at a screen than to fire up an SSH session to see what's going on). I've started validating ledgers a while back too, though I don't really see the point so far, it was more of an experiment to see if it influences system load. The main goal of that machine is to get full history and have a server available locally that can query the whole set of RCL data that is currently publicly available (not just the latest X ledgers).

System specs:

CPU: Xeon E3-1230 V3
RAM: 32GB ECC something (whatever was cheapest at that time)
OS: Manjaro latest (rolling release based on Arch)
DISKS: 80 GB Intel for OS, 3 ~1 TB SSDs for data (3 different models with slightly different sizes though... and it looks like I'll soon need a fourth one)
NET: Until recently a 10 Mbit down/1 Mbit up DSL line, now trying out a 150/50 Mbit LTE connection. Connection in the LAN to the uplink is GBit ethernet.

Settings:

Mostly the example config, with the following changes (besides port numbers/admin settings for ports):

[node_size] huge
[node_db] NuDB
[sntp_servers] added the one from my university as first one in the list, since it is physically close
[ledger_history] full
[fetch_depth] 32 (sorry, but I'll only allow deeper fetches once I've acquired actual full history - see below why)
[validation_seed] s...

Maintenance:

I usually follow github quite closely and often run the beta releases. CPU is not an issue on my system anyways, so full recompilation only takes half a minute or so. Since there's NO way to have full history on spinning metal with any kind of useful performance, plopping in another 1 TB SSD every year got a bit expensive over time; this server contains quite a good portion of my disposable income. In the olden days rippled was a REAL RAM hog, nowadays I'd be more than happy if caching was actually taken up a notch (e.g. querying for a full ledger over RPC and immediately afterwards querying for the binary version of that same full ledger seems to hit the node store instead of a cache!).

Problems:

The biggest 2 issues seem to be disk load and network overhead. Another minor issue is recovering from the 24 hour disconnect (+ IP change) that happens on LTE because the external IP changes causing peers to drop and hurting my validation rate.

About disk load: I am running BTRFS (in JBOD disk pooling mode; originally I only wanted to try it out) as the underlying file system but ran into various issues at different times. The upside is that it is very easy to expand the storage, also if I now buy a 2 TB SSD I don't have to worry about matching sizes or losing expensive space. The downside is that I recently managed to crash my file system in a way that it couldn't really recover from, and out of frustration and also to start from a clean state with the LTE connection I decided to just re-sync from scratch (I still have a node database backup somewhere, but as it is from an older version of rippled it might contain broken data or unneeded records etc. and I'm not so sure if the import command really checks for correct data...). In the end it is at the moment only possible to use BTRFS as the underlying node store file system if you disable a lot of the things that make BTRFS nice...

Network overhead: A full database (NuDB with index, ledger.db, transactions.db(!)) is now probably getting closer and closer to 3 TB in size, it might even already have surpassed this size. My node is the only client on the LTE connection and so far has (in less than a month) caused close to 2 TB in downstream traffic (yes, the connection has no traffic limit). The number of ledgers synced at the moment though? Not even 2 million. This means that syncing about a million ledgers takes about a TB in downstream traffic, while the nodes involved are only about a tenth of that (~100 GB for 1 million ledgers). I'm currently exploring options to share parts of a node.db trustlessly, or at least in a semi-trusted way, so that others can just download a large file, import it and be done with it instead of causing high resource usage on the few servers out there that even serve/have full history. Online_delete is nice for a validator (though I'd argue that a validator should just not store nodes at all and ONLY validate) but the network really needs nodes outside of Ripple Inc that also store and serve everything that remains from history (after all they were the ones losing the initial 32570 ledgers...). I would be very interested especially in the storage backend configuration of these nodes by the way!
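
The overhead described above is easy to put into numbers. This is only back-of-the-envelope arithmetic using the rough figures from the post (about 1 TB downstream and about 100 GB of stored node data per million ledgers); the function names are made up for illustration and the linear scaling is an assumption.

```python
GB_PER_TB = 1000

# Rough, self-reported figures from the post above.
downstream_gb_per_million_ledgers = 1 * GB_PER_TB
stored_gb_per_million_ledgers = 100


def overhead_ratio(downstream_gb, stored_gb):
    """How many bytes are downloaded per byte that ends up stored."""
    return downstream_gb / stored_gb


def estimate_downstream_gb(ledgers, gb_per_million=downstream_gb_per_million_ledgers):
    """Scale the observed traffic linearly to another ledger count (assumption)."""
    return ledgers / 1_000_000 * gb_per_million


ratio = overhead_ratio(downstream_gb_per_million_ledgers, stored_gb_per_million_ledgers)
print(f"about {ratio:.0f}x more traffic than stored node data")
print(f"2 million ledgers: roughly {estimate_downstream_gb(2_000_000):.0f} GB downstream")
```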

Also it takes far too long for my taste to get historic ledgers, probably because there are not that many peers out there that even offer that data and asking them for a largish fetch pack means they have to dig deep into their databases (getting a recent full ledger on my node takes about 10 seconds if it hits the database, fetch packs of course just ask for subsets of that). That's why rippled seems to prefer asking for larger fetch packs which means even longer round trip times, more traffic wasted (since probably a lot of data does NOT change between ledgers) and higher load on the few poor full history nodes. There might be ways around this (e.g. transmitting diffs between full ledgers instead of everything or only transmitting leaf nodes and constructing the SHAmap locally - there could even be a combination of the 2, the difference in leaf nodes between the leaves of 2 full ledgers that are reasonably close to each other is a few kB at most) but at the moment it seems to be easier to throw hardware at the problem...
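
The "transmit only the difference in leaf nodes" idea can be illustrated with a toy model. This is not how rippled works and not a proposal for a wire format: each ledger is simply assumed to be a dict from leaf key to leaf hash, and `leaf_diff` and `apply_diff` are hypothetical names.

```python
def leaf_diff(old, new):
    """Compare two {leaf_key: leaf_hash} snapshots of nearby ledgers."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = sorted(old.keys() - new.keys())
    changed = {k: new[k] for k in old.keys() & new.keys() if old[k] != new[k]}
    return added, changed, removed


def apply_diff(old, added, changed, removed):
    """Rebuild the newer snapshot from the older one plus the diff."""
    result = dict(old)
    for key in removed:
        del result[key]
    result.update(added)
    result.update(changed)
    return result


# Invented example data: three leaves, of which one changes, one goes, one arrives.
ledger_a = {"acct1": "h1", "acct2": "h2", "offer7": "h3"}
ledger_b = {"acct1": "h1", "acct2": "h2x", "acct3": "h4"}

added, changed, removed = leaf_diff(ledger_a, ledger_b)
print(added, changed, removed)
print(apply_diff(ledger_a, added, changed, removed) == ledger_b)
```

The receiver would still have to rebuild the inner nodes locally and check the resulting root hash against a trusted ledger header, which this sketch leaves out.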

I might start again to note down IPs of peers that serve full history and manually add them to the config file, so I'm connected to a higher ratio of useful peers in the network - I'm especially baffled by the fact that rippled maintains long running connections to testnet nodes instead of dropping them off and trying to find more useful nodes on the net.

Edited April 15, 2016 by Sukrim
clarification

Hodor · April 15, 2016

16 hours ago, mDuo13 said:

RAM: 16GB Corsair XMS3 (what can I say, it was a Black Friday sale!)

This triggered my jealousy immediately. I'm looking for time to install Rippled on my Ubuntu VM machine with 3 GB allocated RAM.

Twarden · April 15, 2016

3 hours ago, Hodor said:

This triggered my jealousy immediately. I'm looking for time to install Rippled on my Ubuntu VM machine with 3 GB allocated RAM.

The wiki states that the minimum server requirement for rippled is 4GB of RAM.

@Sukrim I'd be interested in adding that list of peers to my configuration as well. Will you please add the full history IPs to the List of developer resources thread when you have the time?

I wrote this script to test with crontab today regarding my struggles with my online delete settings; if it works I will update this post to share a workaround for others who may be experiencing this issue.

Quote

#!/bin/sh
#Prunes the rippled ledger data in /var/lib/rippled/db and /var/lib/rippled/db/nudb

#End the rippled process
/home/rippled/build/rippled --conf /etc/rippled/rippled.cfg stop

#Prune the ledger data
rm /var/lib/rippled/db/state.*
rm /var/lib/rippled/db/ledger.db
rm /var/lib/rippled/db/wallet.db
rm /var/lib/rippled/db/random.seed
rm -r /var/lib/rippled/db/nudb/rippledb.*

#Enter screen and restart the rippled process
screen -d -m bash -c "/home/rippled/build/rippled --conf /etc/rippled/rippled.cfg --net --start"

Edited April 15, 2016 by Twarden

nikb · April 15, 2016

49 minutes ago, Twarden said:

The wiki states that the minimum server requirement for rippled is 4GB of RAM.

@Sukrim I'd be interested in adding that list of peers to my configuration as well. Will you please add the full history IPs to the List of developer resources thread when you have the time?

I wrote this script to test with crontab today regarding my struggles with my online delete settings; if it works I will update this post to share a workaround for others who may be experiencing this issue.

Removing wallet.db will cause your server to change its node identity. This isn't a huge deal, but some config options related to clustering (which you probably don't use) could stop working, and it's likely that some of the tools that track the network and how servers connect (e.g. https://peers.ripple.com) will treat your server as a new server when that happens.

My suggestion is to specify your node identity in the configuration file. It's simple: generate a new validation key, and then save it in the config under the [node_seed] option.

Sukrim · April 15, 2016

51 minutes ago, Twarden said:

I'd be interested in adding that list of peers to my configuration as well. Will you please add the full history IPs to the List of developer resources thread when you have the time?

Just look at the output of the "peers" API call and check the "complete_ledgers" field. Looking at the current IPs it seems that they change from time to time, maybe as machines get rebooted on AWS when updating the software and get assigned a new dynamic IP - so a public list would not make that much sense I fear.
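
Picking out the deep-history peers from that output could look roughly like this. It is a sketch that assumes the "peers" response has a result.peers list whose entries carry "address" and a "complete_ledgers" string such as "32570 - 20412345"; the sample data, the addresses (documentation-only IP ranges) and the function names are all invented.

```python
def oldest_ledger(peer):
    """Lowest ledger index a peer advertises, or None if it advertises none."""
    lows = []
    for part in peer.get("complete_ledgers", "").split(","):
        first = part.split("-")[0].strip()
        if first.isdigit():
            lows.append(int(first))
    return min(lows) if lows else None


def deep_history_peers(response, oldest_wanted):
    """Addresses of peers whose history reaches back to oldest_wanted."""
    selected = []
    for peer in response.get("result", {}).get("peers", []):
        low = oldest_ledger(peer)
        if low is not None and low <= oldest_wanted:
            selected.append(peer.get("address", "unknown"))
    return selected


# Invented sample shaped like a "peers" response.
SAMPLE_RESPONSE = {
    "result": {
        "peers": [
            {"address": "192.0.2.10:51235", "complete_ledgers": "32570 - 20412345"},
            {"address": "198.51.100.7:51235", "complete_ledgers": "20411000 - 20412345"},
            {"address": "203.0.113.5:51235"},
        ]
    }
}

print(deep_history_peers(SAMPLE_RESPONSE, 32570))
```

Since the addresses change over time, something like this would have to be rerun against fresh output rather than saved as a static list.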

Your script tries to delete a file named "state.*" that I've never heard of, also you are not pruning, but outright completely deleting the whole node database (but not the SQLite transaction.db?)... Maybe this is what confuses your rippled enough that online_delete doesn't work? On the other hand you might have misunderstood what online_delete does or how much history it will fetch until it comes into effect. With this script you'll essentially reset rippled (except the peerfinder and transaction databases) causing it to re-sync from scratch and (with the settings posted above) start fetching history down to ledger 1,810,000 (which is from the end of August 2013).

Edited April 15, 2016 by Sukrim

Twarden · April 15, 2016

I read through JoelKatz's response to my question about online delete and I realize that I misunderstood what he meant by setting these values. I haven't experienced issues with leaving the transaction.db intact besides the occasional hiccup where a warning error is thrown that we are missing a valid transaction in the built database, which then sorts itself out, and I kept my wallet.db intact as well. I first tested my script by running it manually and it failed to restart the service, so I set up the script to execute with the at command. The first time I ran it with a change to my script to restart rippled it failed again. The last attempt I made using at was by writing a line that would attach the rippled screen window I have open under the screen process and execute the start command. I've reset my ledger_history to 2.5M ledgers and my online_delete setting to 3.5M ledgers, so I will wait to see if the disk space will fill entirely with these new settings. If all else fails, I will continue to tweak this script to be executed with at to deal with the ledger data:

Quote

#!/bin/sh
#Prunes the rippled ledger data in /var/lib/rippled/db and /var/lib/rippled/db/nudb

#End the rippled process
/home/rippled/build/rippled --conf /etc/rippled/rippled.cfg stop

#Prune the ledger data
rm /var/lib/rippled/db/state.*
rm /var/lib/rippled/db/ledger.db
rm /var/lib/rippled/db/random.seed
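rm -r /var/lib/rippled/db/nudb/rippledb.*

#Restart the server by typing the start command into window 0 of the existing "rippled" screen session
screen -S rippled -p 0 -X stuff "/home/rippled/build/rippled --conf /etc/rippled/rippled.cfg --net --start$(printf '\r')"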

nikb · April 17, 2016

On 4/15/2016 at 0:53 PM, Twarden said:

I read through JoelKatz's response to my question about online delete and I realize that I misunderstood what he meant by setting these values. I haven't experienced issues with leaving the transaction.db intact besides the occasional hiccup where a warning error is thrown that we are missing a valid transaction in the built database, which then sorts itself out, and I kept my wallet.db intact as well. I first tested my script by running it manually and it failed to restart the service, so I set up the script to execute with the at command. The first time I ran it with a change to my script to restart rippled it failed again. The last attempt I made using at was by writing a line that would attach the rippled screen window I have open under the screen process and execute the start command. I've reset my ledger_history to 2.5M ledgers and my online_delete setting to 3.5M ledgers, so I will wait to see if the disk space will fill entirely with these new settings. If all else fails, I will continue to tweak this script to be executed with at to deal with the ledger data:

By the way, removing random.seed doesn't do much: it's a small cache of entropy, maintained across server restarts so that the server can feed it into its secure random number generation pool immediately. Removing it won't harm anything (we also grab entropy from the operating system) but leaving it there won't make a difference.

T8493 · April 17, 2016

Yesterday I set up a rippled instance on a Hyper-V virtual machine.

Current guest machine specifications: CentOS, 4 CPU, 4 GB RAM, virtual disk is stored on Samsung SSD 840 EVO. Physical machine is i7 with 8GB RAM, one fast spinning disk + one SSD disk, Windows 10 Professional.

I followed official instructions.

After spending 8+ hours tweaking different settings, I came to a rippled configuration that somewhat works. Previous configurations didn't work, because rippled was either using too much memory, too much disk (disk queue was too long or it was constantly reading 30-70MB/s) or it didn't synchronize (it stayed in "connected" state for too long) for an unknown reason.

Some findings:

spinning disk + RocksDB combination doesn't work; even if rippled may work at the beginning, it will soon lose synchronization and won't be able to synchronize again,
so use SSD + nudb instead and don't even think of using a spinning disk,
8GB RAM in host machine is not enough, one probably needs at least 12-16GB,
there was a big difference between 3.5GB and 4GB RAM assigned to a guest machine. I don't know why. When I was using 3.5GB RAM instead of 4GB, the disk worked constantly (probably due to swapping). However, currently Hyper-V Manager reports only 2.3GB assigned memory (the maximum memory is still set to 4GB).
rippled can be very "fragile"; for example, when Windows started downloading software updates, rippled lost synchronization. When Windows Defender started checking files, rippled also lost synchronization. When I was doing anything other than browsing (compiling/running tests), rippled lost synchronization,
there is no difference in memory usage if I change node_size (currently it is set to tiny),
rippled can behave very differently after it stabilizes and runs for a while. So one should reject a given rippled configuration only after it has run for a while,
I didn't see any difference after I enabled validation.

Dstat currently reports:

4MB disk write every 4 seconds, almost no disk reads,
network usage: 50-100kB/s received, 20-50 kB/s sent,
CPU is 95-99% idle.

This is a good example of why I think running a rippled instance could be expensive. I needed 8+ hours just for the initial setup.

Question: is the "proposing" server state included in the "full" server state that is reported by the "rippled server_state" command under the "state_accounting" property?

Edited April 17, 2016 by T8493

nikb · April 17, 2016

3 hours ago, T8493 said:

Yesterday I set up a rippled instance on a Hyper-V virtual machine.

Current guest machine specifications: CentOS, 4 CPU, 4 GB RAM, virtual disk is stored on Samsung SSD 840 EVO. Physical machine is i7 with 8GB RAM, one fast spinning disk + one SSD disk, Windows 10 Professional.

I followed official instructions.

After spending 8+ hours tweaking different settings, I came to a rippled configuration that somewhat works. Previous configurations didn't work, because rippled was either using too much memory, too much disk (disk queue was too long or it was constantly reading 30-70MB/s) or it didn't synchronize (it stayed in "connected" state for too long) for an unknown reason.

Some findings:

spinning disk + RocksDB combination doesn't work; even if rippled may work at the beginning, it will soon lose synchronization and won't be able to synchronize again,

so use SSD + nudb instead and don't even think of using a spinning disk,

8GB RAM in host machine is not enough, one probably needs at least 12-16GB,

there was a big difference between 3.5GB and 4GB RAM assigned to a guest machine. I don't know why. When I was using 3.5GB RAM instead of 4GB, the disk worked constantly (probably due to swapping). However, currently Hyper-V Manager reports only 2.3GB assigned memory (the maximum memory is still set to 4GB).

rippled can be very "fragile"; for example, when Windows started downloading software updates, rippled lost synchronization. When Windows Defender started checking files, rippled also lost synchronization. When I was doing anything other than browsing (compiling/running tests), rippled lost synchronization,

there is no difference in memory usage if I change node_size (currently it is set to tiny),

rippled can behave very differently after it stabilizes and runs for a while. So one should reject a given rippled configuration only after it has run for a while,

I didn't see any difference after I enabled validation.

Dstat currently reports:

4MB disk write every 4 seconds, almost no disk reads,

network usage: 50-100kB/s received, 20-50 kB/s sent,

CPU is 95-99% idle.

This is a good example of why I think running a rippled instance could be expensive. I needed 8+ hours just for the initial setup.

Question: is the "proposing" server state included in the "full" server state that is reported by the "rippled server_state" command under the "state_accounting" property?

Thanks for detailing your findings here. I'm a bit surprised to hear that your machine needed 12-16GB - I'm running a server with half the memory (6GB) and am not really encountering any problems.

Running inside a VM should be fine provided the hypervisor provides good disk I/O performance and the underlying disk is an SSD. Based on your comment about fragility when other things are going on (I assume on the host machine) then I suspect the issues you encountered mostly trace back to I/O bottlenecks, but it's hard to point a finger conclusively.

It's true that on a fresh startup, with no databases, it can take a bit to sync up to the network (because rippled is trying to hit a moving target: it's asking for fetch packs to get the "current state" of the network, but that state changes all the time) depending on the number of peers you have, the network load and your Internet connection. I've seen it take a couple of minutes, but never more than that, really.

Generally, your experience seems atypical to me. I've had no issues running the configuration file from https://github.com/ripple/rippled/blob/develop/doc/rippled-example.cfg after changing the paths (to reflect my machine) and opting to use NuDB instead of RocksDB. I'm not exaggerating when I tell you that, using the RPM package, it takes me less than 10 minutes to bring a server up, and that includes creating a user account so I don't run the server as root.

I'd like to hear more about what configurations you tried and rejected; if you feel comfortable posting here, then please do as it may also benefit others. If you'd prefer something more private, you can always e-mail me (nikb@ripple.com). I want to try and trace the root cause of the issues you've encountered and see if we can address them in the code.

To answer your last question, proposing implies full: it's a state unique to validators - it means that not only is the server caught up to the network, but that it is also participating in the consensus process.
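
To see how much time a server has spent in each state, the "state_accounting" block can be turned into percentages. This is a sketch under assumptions: it takes state_accounting to be a mapping from state name to an object with a "duration_us" field (given as a string or a number), the sample values are invented, and `state_shares` is a made-up name. Following the answer above, time spent proposing would presumably show up under "full" here; that is an inference from the answer, not something checked against the rippled source.

```python
def state_shares(state_accounting):
    """Fraction of total tracked time spent in each server state."""
    durations = {
        name: int(entry["duration_us"]) for name, entry in state_accounting.items()
    }
    total = sum(durations.values())
    if total == 0:
        return {name: 0.0 for name in durations}
    return {name: value / total for name, value in durations.items()}


# Invented sample shaped like a state_accounting block.
SAMPLE_ACCOUNTING = {
    "connected": {"duration_us": "600000000", "transitions": 1},
    "disconnected": {"duration_us": "200000000", "transitions": 1},
    "full": {"duration_us": "9000000000", "transitions": 1},
    "syncing": {"duration_us": "150000000", "transitions": 1},
    "tracking": {"duration_us": "50000000", "transitions": 1},
}

for name, share in sorted(state_shares(SAMPLE_ACCOUNTING).items()):
    print(f"{name}: {share:.1%}")
```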

T8493 · April 17, 2016

4 hours ago, nikb said:

I'm a bit surprised to hear that your machine needed 12-16GB - I'm running a server with half the memory (6GB) and am not really encountering any problems.

My machine would need 12GB RAM if I wanted to use it for anything else than just browsing, e.g. compilation, running tests, running MS SQL Server, etc.

Quote

Running inside a VM should be fine provided the hypervisor provides good disk I/O performance and the underlying disk is an SSD. Based on your comment about fragility when other things are going on (I assume on the host machine) then I suspect the issues you encountered mostly trace back to I/O bottlenecks, but it's hard to point a finger conclusively.

Yes, on the host machine. But these tasks were just regular maintenance tasks that are run periodically by Windows.

Quote

I'd like to hear more about what configurations you tried and rejected; if you feel comfortable posting here, then please do as it may also benefit others. If you'd prefer something more private, you can always e-mail me (nikb@ripple.com). I want to try and trace the root cause of the issues you've encountered and see if we can address them in the code.

I don't mind sharing configurations here, but I doubt I will be able to remember all the details.

First set of configurations: 2 CPU, 2-3.5GB RAM, virtual disk was on the spinning disk (spinning disk is 7200 RPM and recently defragmented), started with original rippled.cfg. Worked for some time, but then the disk usage became problematic. Tried changing 2 CPU->4 CPU, node_size medium -> tiny.

Second set of configurations: moved swap file on the host computer to a practically new SSD disk. Enabled ReadyBoost.

Third set of configurations: moved virtual disk to this new SSD disk. VM has 4 CPUs, 3.5GB RAM. Changed RocksDB->NuDB. Worked better, but I still noticed extremely heavy disk usage. Tried changing node_size to medium, small and tiny.

Fourth set of configurations: main change was 3.5GB->4GB RAM. This postponed VM swap usage to a later time, because swapping probably still occurs. Currently, top shows that VM swap size is around 100MB after 12 hours of uptime.

I think there was some kind of memory problem. My interpretation: rippled estimated it could use certain amount of memory, but in reality this amount was too large and VM OS started swapping memory to a virtual disk (dstat reported large numbers in "paging in" and "paging out" columns). There was no shortage of "real" RAM in the host machine. However, this swapping seriously overloaded I/O subsystem on the host machine (on spinning disk read queue length was constantly more than 10 or 20 - as reported by Windows Performance Monitor).

Currently, dstat in VM reports almost no disk reads, only occasional writes. When I encountered heavy disk usage, most of the disk activity consisted of reads and not writes.

I tried to limit amount of memory rippled can use by setting node_size to tiny, but this didn't have any noticeable effect.

EDIT: I've been observing VM behaviour under different conditions and I think it could be some kind of problem related to Hyper V dynamic memory allocation. When Hyper V readjusts assigned memory, rippled doesn't detect this and it can end up using a disproportionately large amount of memory.

Two examples of such behaviour:

Option "(Startup) RAM" is set to 2 GB in Hyper V Manager => rippled and other processes use only 1 GB Memory => 0.7 GB of memory is used as a disk cache. In this case assigned memory always stays the same (2 GB).

Option "(Startup) RAM" is set to 4 GB => after a while Hyper V reduces assigned memory from 4 GB to around 2.6 GB, but rippled and other processes still use 2.4 GB. However, the "free" command still reports 1 GB of RAM being used as cache, although in reality there's no 1 GB available for disk cache (because Hyper V assigned only 2.6 GB RAM to this VM and 2.4 GB of memory is reported as used).

Maybe these tools don't report correct numbers or maybe I don't understand what these numbers mean.

Edited April 17, 2016 by T8493

nikb · April 17, 2016

Using a spinning disk is likely not going to cut it - especially once you factor in the virtualization overhead (and it's not small, especially if the virtual disk image is, itself, stored on a file system instead of directly occupying an entire dedicated disk or a partition). It doesn't take a lot to imagine I/O performance being abysmal in such a setup.

Onwards to RAM: you can't expect rippled - or really any software - to know what the hypervisor is doing with RAM outside the virtual machine; that's the whole point of a VM; the guest is happy as a clam, running on virtual hardware as the host is doing who-knows-what. Generally, the amount of physical RAM available on a machine is fixed for the duration of execution. There are some exotic architectures where this isn't the case, but those require operating system support and try to make this transparent to actual applications.

As for the 1GB cache: the O/S will aggressively use RAM as a cache. This makes sense: you don't want RAM to be sitting empty, it does you no good. So if you have, say, 4GB of RAM but the programs you run only need 1.2GB, then the O/S will do its best to use the remaining 2.8GB. And what can it use it as other than cache? If the guest came under memory pressure, it would reduce that 1GB rapidly by just throwing away large chunks of the cache.
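
The split between memory that programs actually hold and memory the O/S is merely using as cache can be read from /proc/meminfo inside the guest. The sketch below parses a sample of that text; the numbers are invented to mirror the 4GB / 1.2GB example, the helper names are hypothetical, and "total minus free minus buffers minus cached" is a deliberate simplification of how tools like free account for memory.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value_in_kB}."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])
    return values


def memory_summary(info):
    """Separate memory held by programs from reclaimable cache (simplified)."""
    cache_kb = info["Buffers"] + info["Cached"]
    programs_kb = info["MemTotal"] - info["MemFree"] - cache_kb
    return {"programs_kb": programs_kb, "cache_kb": cache_kb, "free_kb": info["MemFree"]}


# Invented sample: roughly 4 GB total, 1.2 GB used by programs, the rest mostly cache.
SAMPLE_MEMINFO = """MemTotal:        4000000 kB
MemFree:          100000 kB
Buffers:          200000 kB
Cached:          2500000 kB
"""

summary = memory_summary(parse_meminfo(SAMPLE_MEMINFO))
for key, value in summary.items():
    print(f"{key}: {value / 1_000_000:.1f} GB")
```

On a real guest the text would come from reading /proc/meminfo; a large cache figure alongside a small free figure is the normal, healthy case described above.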

I don't use Hyper-V so it's hard to know what it does or how to configure it. My recommendation would be to (a) allocate a fixed amount of RAM for your VM and not tweak it - 4GB should be plenty and (b) make sure the disk you're using in the guest is preallocated, otherwise you will encounter weird performance issues as the host attempts to grow the disk's backing store to account for actual use.

T8493 · April 17, 2016

49 minutes ago, nikb said:

I don't use Hyper-V so it's hard to know what it does or how to configure it. My recommendation would be to (a) allocate a fixed amount of RAM for your VM and not tweak it - 4GB should be plenty and (b) make sure the disk you're using in the guest is preallocated, otherwise you will encounter weird performance issues as the host attempts to grow the disk's backing store to account for actual use.

Disk was preallocated. A fixed 4 GB of RAM is a little bit too much for a host machine that has only 8 GB. That's why I suggested one should have 12-16GB.

However, I'm now testing with only 1.5GB of fixed RAM and it works fine for now (without excessive swapping and it stays synchronized). EDIT 2: excessive swapping started. 1.5GB is not enough.

EDIT: Is it possible to tell rippled not to use more than X MB amount of memory regardless of amount of assigned/available RAM? I tried to achieve this with node_size, but it doesn't work.

Edited April 17, 2016 by T8493
