CPU usage rippled

tulo

Posted (edited)

Is it normal that an Intel i7, 6th gen, 3.5 GHz always sits between 110% and 350% CPU usage just for rippled (not validating)?

And there are around 30-50 transactions per ledger. What the hell is the CPU doing with 30 transactions every 4 seconds?

Do you have similar numbers?
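For anyone wanting to sanity-check the same thing: a minimal sketch for sampling rippled's CPU usage over time, assuming the psutil package is installed and the process is actually named "rippled" (top or pidstat on the PID would show the same numbers):

import time
import psutil

# Sample the rippled process's CPU usage every few seconds.
# Values above 100% mean more than one core is busy.
def sample_rippled_cpu(interval=5, samples=6):
    procs = [p for p in psutil.process_iter(["name"]) if p.info["name"] == "rippled"]
    if not procs:
        print("no rippled process found")
        return
    proc = procs[0]
    proc.cpu_percent(None)  # prime the counter; the first call always returns 0.0
    for _ in range(samples):
        time.sleep(interval)
        print(f"{time.strftime('%H:%M:%S')}  rippled CPU: {proc.cpu_percent(None):.1f}%")

sample_rippled_cpu()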

Edited by tulo


Maybe some database compression or other background work? I stopped my node for now to have a bit more bandwidth for working from home, but I hadn't seen any significant CPU usage from rippled so far.


When I was running my rippled servers, the CPUs were only at some 5-10%. Something else must be running in the background. Or are you experimenting with your own builds?

17 hours ago, Duke67 said:

When I was running my rippled servers, the CPUs were only at some 5-10%. Something else must be running in the background. Or are you experimenting with your own builds?

Nope... it's the latest rippled.

I also don't think I have any weird config.


This is my rippled.cfg... it seems standard to me:

[server]
port_rpc_admin_local
port_peer
port_ws_admin_local

[port_rpc_admin_local]
port = 5005
ip = 127.0.0.1
admin = 127.0.0.1
protocol = http

[port_peer]
port = 51235
ip = 0.0.0.0
protocol = peer

[port_ws_admin_local]
port = 6006
ip = 127.0.0.1
admin = 127.0.0.1
protocol = ws

 

[node_size]
small


[node_db]
type=NuDB
path=/var/lib/rippled/db/nudb
open_files=2000
filter_bits=12
cache_mb=256
file_size_mb=8
file_size_mult=2
online_delete=512
advisory_delete=0

 

[database_path]
/var/lib/rippled/db


[debug_logfile]
/var/log/rippled/debug.log

[sntp_servers]
time.windows.com
time.apple.com
time.nist.gov
pool.ntp.org


[validators_file]
validators.txt


[rpc_startup]
{ "command": "log_level", "severity": "warning" }


[ssl_verify]
1
 


That seems pretty standard. Some spikes aren’t unexpected, but if you see that much usage constantly, something is off.

 

Can you show us the output of server_info, please?
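(For anyone following along: server_info comes either from running "rippled server_info" on the box itself, or from a request to the admin RPC port in the config above. A minimal sketch using only the Python standard library, assuming the default localhost port 5005 from that config:)

import json
import urllib.request

# Query the local admin RPC port and pull out a few of the interesting fields.
req = urllib.request.Request(
    "http://127.0.0.1:5005/",
    data=json.dumps({"method": "server_info", "params": [{}]}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    info = json.load(resp)["result"]["info"]

print(info["server_state"], info["load_factor"], info["complete_ledgers"])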

4 hours ago, nikb said:

Can you show us the output of server_info, please?

{
   "result" : {
      "info" : {
         "build_version" : "1.5.0",
         "complete_ledgers" : "54871802-54872172",
         "hostid" : "XXX",
         "io_latency_ms" : 1,
         "jq_trans_overflow" : "0",
         "last_close" : {
            "converge_time_s" : 3.061,
            "proposers" : 36
         },
         "load" : {
            "job_types" : [
               {
                  "avg_time" : 4,
                  "job_type" : "ledgerRequest",
                  "peak_time" : 39,
                  "per_second" : 3
               },
               {
                  "avg_time" : 6,
                  "job_type" : "untrustedProposal",
                  "peak_time" : 52,
                  "per_second" : 27,
                  "waiting" : 52
               },
               {
                  "avg_time" : 35,
                  "job_type" : "ledgerData",
                  "peak_time" : 1103,
                  "per_second" : 2
               },
               {
                  "avg_time" : 2,
                  "in_progress" : 1,
                  "job_type" : "clientCommand",
                  "peak_time" : 38,
                  "per_second" : 9,
                  "waiting" : 5
               },
               {
                  "avg_time" : 45,
                  "in_progress" : 3,
                  "job_type" : "transaction",
                  "peak_time" : 790,
                  "per_second" : 13
               },
               {
                  "avg_time" : 4,
                  "in_progress" : 1,
                  "job_type" : "batch",
                  "peak_time" : 225,
                  "per_second" : 6
               },
               {
                  "avg_time" : 31,
                  "job_type" : "advanceLedger",
                  "peak_time" : 526,
                  "per_second" : 12
               },
               {
                  "avg_time" : 13,
                  "job_type" : "fetchTxnData",
                  "peak_time" : 279,
                  "per_second" : 6
               },
               {
                  "avg_time" : 67,
                  "job_type" : "trustedValidation",
                  "peak_time" : 848,
                  "per_second" : 11
               },
               {
                  "in_progress" : 1,
                  "job_type" : "acceptLedger"
               },
               {
                  "avg_time" : 15,
                  "job_type" : "trustedProposal",
                  "peak_time" : 301,
                  "per_second" : 14
               },
               {
                  "avg_time" : 101,
                  "job_type" : "heartbeat",
                  "peak_time" : 189
               },
               {
                  "job_type" : "peerCommand",
                  "peak_time" : 2,
                  "per_second" : 850
               },
               {
                  "job_type" : "processTransaction",
                  "peak_time" : 1,
                  "per_second" : 13
               },
               {
                  "job_type" : "SyncReadNode",
                  "peak_time" : 45,
                  "per_second" : 11711
               },
               {
                  "job_type" : "AsyncReadNode",
                  "peak_time" : 18,
                  "per_second" : 426
               },
               {
                  "job_type" : "WriteNode",
                  "peak_time" : 25,
                  "per_second" : 255
               }
            ],
            "threads" : 6
         },
         "load_factor" : 1,
         "peer_disconnects" : "0",
         "peer_disconnects_resources" : "0",
         "peers" : 10,
         "pubkey_node" : "XXX",
         "pubkey_validator" : "none",
         "server_state" : "full",
         "server_state_duration_us" : "7370012",
         "state_accounting" : {
            "connected" : {
               "duration_us" : "456934932",
               "transitions" : 1
            },
            "disconnected" : {
               "duration_us" : "3593089",
               "transitions" : 1
            },
            "full" : {
               "duration_us" : "523765505",
               "transitions" : 8
            },
            "syncing" : {
               "duration_us" : "31406282",
               "transitions" : 8
            },
            "tracking" : {
               "duration_us" : "51",
               "transitions" : 8
            }
         },
         "time" : "2020-Apr-18 08:14:42.000574 UTC",
         "uptime" : 1015,
         "validated_ledger" : {
            "age" : 8,
            "base_fee_xrp" : 1e-05,
            "hash" : "7BE1535B562B879C64631188ED61A640207DF78E670318C18F5EC7A01657E5F4",
            "reserve_base_xrp" : 20,
            "reserve_inc_xrp" : 5,
            "seq" : 54872172
         },
         "validation_quorum" : 29,
         "validator_list" : {
            "count" : 1,
            "expiration" : "2020-Jun-02 00:00:00.000000000 UTC",
            "status" : "active"
         }
      },
      "status" : "success"
   }
}

 


I'm posting the results again, because the old output was captured only a few seconds after a restart... the following is after days of running.

{
   "result" : {
      "info" : {
         "build_version" : "1.5.0",
         "complete_ledgers" : "55162159-55162719",
         "hostid" : "xxx",
         "io_latency_ms" : 1,
         "jq_trans_overflow" : "0",
         "last_close" : {
            "converge_time_s" : 3.675,
            "proposers" : 36
         },
         "load" : {
            "job_types" : [
               {
                  "avg_time" : 171,
                  "job_type" : "ledgerRequest",
                  "peak_time" : 1100,
                  "per_second" : 2
               },
               {
                  "avg_time" : 66,
                  "job_type" : "untrustedProposal",
                  "peak_time" : 749,
                  "per_second" : 30
               },
               {
                  "avg_time" : 37,
                  "job_type" : "ledgerData",
                  "peak_time" : 639,
                  "per_second" : 3
               },
               {
                  "avg_time" : 3,
                  "in_progress" : 2,
                  "job_type" : "clientCommand",
                  "peak_time" : 91,
                  "per_second" : 4
               },
               {
                  "avg_time" : 54,
                  "job_type" : "transaction",
                  "peak_time" : 669,
                  "per_second" : 3
               },
               {
                  "avg_time" : 13,
                  "job_type" : "batch",
                  "peak_time" : 342,
                  "per_second" : 1
               },
               {
                  "avg_time" : 34,
                  "job_type" : "advanceLedger",
                  "peak_time" : 459,
                  "per_second" : 8
               },
               {
                  "avg_time" : 18,
                  "job_type" : "fetchTxnData",
                  "peak_time" : 790,
                  "per_second" : 3
               },
               {
                  "avg_time" : 122,
                  "job_type" : "trustedValidation",
                  "peak_time" : 915,
                  "per_second" : 8
               },
               {
                  "in_progress" : 1,
                  "job_type" : "acceptLedger"
               },
               {
                  "avg_time" : 34,
                  "job_type" : "trustedProposal",
                  "peak_time" : 423,
                  "per_second" : 13
               },
               {
                  "in_progress" : 1,
                  "job_type" : "sweep"
               },
               {
                  "avg_time" : 173,
                  "job_type" : "heartbeat",
                  "peak_time" : 663
               },
               {
                  "job_type" : "peerCommand",
                  "peak_time" : 13,
                  "per_second" : 535
               },
               {
                  "job_type" : "processTransaction",
                  "per_second" : 3
               },
               {
                  "job_type" : "SyncReadNode",
                  "peak_time" : 346,
                  "per_second" : 5445
               },
               {
                  "job_type" : "AsyncReadNode",
                  "peak_time" : 5,
                  "per_second" : 1322
               },
               {
                  "job_type" : "WriteNode",
                  "peak_time" : 16,
                  "per_second" : 1993
               }
            ],
            "threads" : 6
         },
         "load_factor" : 1,
         "peer_disconnects" : "439",
         "peer_disconnects_resources" : "0",
         "peers" : 10,
         "pubkey_node" : "xxx",
         "pubkey_validator" : "none",
         "server_state" : "full",
         "server_state_duration_us" : "1314268034",
         "state_accounting" : {
            "connected" : {
               "duration_us" : "13697771492",
               "transitions" : 15030
            },
            "disconnected" : {
               "duration_us" : "3928594039",
               "transitions" : 28
            },
            "full" : {
               "duration_us" : "1092198327089",
               "transitions" : 21439
            },
            "syncing" : {
               "duration_us" : "18600613786",
               "transitions" : 6485
            },
            "tracking" : {
               "duration_us" : "4365361939",
               "transitions" : 21444
            }
         },
         "time" : "2020-May-01 10:37:37.054364 UTC",
         "uptime" : 1132790,
         "validated_ledger" : {
            "age" : 6,
            "base_fee_xrp" : 1e-05,
            "hash" : "41C71CA868653CCB19475EED1253267632F31734A0772E5E79373F625F64E5CB",
            "reserve_base_xrp" : 20,
            "reserve_inc_xrp" : 5,
            "seq" : 55162719
         },
         "validation_quorum" : 29,
         "validator_list" : {
            "count" : 1,
            "expiration" : "2020-Jun-02 00:00:00.000000000 UTC",
            "status" : "active"
         }
      },
      "status" : "success"
   }
}

 


Yikes, that is a lot of transitions in/out of the full state—an average of once every ~53 seconds! Sounds to me like your hardware (or network?) might not be able to keep up with the ledger.

Would be nice to know what kind of disks, RAM, and network connection the server is on—gotta get more datapoints on what's sufficient and what's not. Also, what [node_size] setting do you have in the config file? Also, are you using this machine for a bunch of other stuff, or is it dedicated hardware?
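For reference, the "~53 seconds" comes straight out of the second server_info dump above; a quick sanity check in Python (numbers copied from that output, durations are in microseconds):

# Average spacing of "full" transitions over the server's lifetime.
uptime_s = 1132790                      # "uptime"
full_transitions = 21439                # state_accounting.full.transitions
full_duration_s = 1092198327089 / 1e6   # state_accounting.full.duration_us

print(f"average time between full transitions: {uptime_s / full_transitions:.1f} s")
print(f"share of uptime spent in the full state: {full_duration_s / uptime_s:.1%}")

So the node spends most of its time in the full state, but it keeps dropping out and resyncing roughly every minute.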

Posted (edited)
On 5/8/2020 at 9:16 PM, mDuo13 said:

Yikes, that is a lot of transitions in/out of the full state—an average of once every ~53 seconds! Sounds to me like your hardware (or network?) might not be able to keep up with the ledger.

Would be nice to know what kind of disks, RAM, and network connection the server is on—gotta get more datapoints on what's sufficient and what's not. Also, what [node_size] setting do you have in the config file? Also, are you using this machine for a bunch of other stuff, or is it dedicated hardware?

  • Network is GPON FTTH, 1000 Mb/s download and 200 Mb/s upload, with very low ping. I don't think it's the cause. It disconnects sometimes (3-4 times a day).
  • [node_size] is "small"
  • 16 GB DDR4 RAM, 2133 MHz, single bank.
  • Disk is a 250 GB M.2 SATA III SSD.
  • Almost dedicated hardware. Only running some minor things with low CPU and memory usage.
  • CPU is an i7-7567U. Not the best i7 but not the worst.

I think it's not able to keep up with the ledger because rippled is asking for too much CPU... sometimes it goes full throttle :).

Edited by tulo


Hm, very odd. Those specs sound like they should be sufficient, but obviously something is wrong for it to be falling out of sync that often. Unfortunately I don't know where to look next.

Posted (edited)
On 5/12/2020 at 11:00 PM, Sukrim said:

The 512-ledger online_delete seems very short...

Yeah, I set it that low to try to reduce the requirements. Shouldn't it help? At least with some RAM or SSD space.
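For context, a rough back-of-the-envelope of what online_delete=512 keeps, assuming a typical ledger close time of about 3.5-4 seconds:

# How much history 512 ledgers roughly covers (the close time is an assumption, not measured).
ledgers_kept = 512
close_time_s = 3.8

print(f"retained history: ~{ledgers_kept * close_time_s / 60:.0f} minutes")

So it's only about half an hour of history, which is why it looks short; it does keep the on-disk store small, though.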

Edited by tulo


What I see in the logs is A LOT of "LoadMonitor:WRN Job" warnings.

Some of them with very long times, such as:

2020-May-03 17:02:03.925339628 UTC LoadMonitor:WRN Job: processLedgerData run: 0ms wait: 337895ms

I mean... 337 seconds of waiting before the ledger data even gets processed?? :mellow:
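In case it helps, a small sketch for pulling the worst offenders out of the debug log. The regex is just based on the warning format quoted above (other LoadMonitor messages may differ slightly), and the path is the default one from the config:

import re

# Collect LoadMonitor job warnings and sort them by wait time.
pattern = re.compile(r"LoadMonitor:WRN Job: (\S+) run: (\d+)ms wait: (\d+)ms")

worst = []
with open("/var/log/rippled/debug.log") as log:
    for line in log:
        m = pattern.search(line)
        if m:
            worst.append((int(m.group(3)), m.group(1)))

for wait_ms, job in sorted(worst, reverse=True)[:10]:
    print(f"{job}: waited {wait_ms / 1000:.1f} s")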

