
The network is down?



22 hours ago, Sukrim said:

but in general just by looking at ledger headers there's no way someone can find out after the fact that for some time some ledgers were only partially validated. Maybe this is different in newer versions of rippled though.

Isn't there a timestamp? Probably I don't get how the close_time is calculated.

Edited by tulo


5 hours ago, tulo said:

So why was the JIRA made private? :rolleyes:

Good question. I wish I had a better answer for you, but it boils down to this:

The company was using JIRA for everything else, and management wanted to be able to monitor the work we were doing on JIRA instead of having to rely on a separate thing, so we ended up migrating away from GitHub and onto JIRA.

The original JIRA tracker was public, and the company maintained a private JIRA for the other closed-source software being developed. But having two trackers, one public and one private, caused both confusion and managerial nightmares. Also, people were concerned about security and about accidental leaks from posting something private on the public instance, and so the public instance was deprecated and, eventually, shut down.

Nobody from our team really liked JIRA, to be honest: it’s both overly complex and very rigid, and the UX is just not that great.

In an ideal world, we would have continued using GitHub, so that issues could be created and tracked publicly. But we were told to use JIRA, and (a) there were no really good tools for bidirectional syncing between JIRA and GitHub, and manual syncing was a huge headache, and (b) there seemed to be little interest in a public issue tracker.

Of course, you can argue that not having a public issue tracker reduced interest and that reduced interest was then used to justify not having to have a public issue tracker, and round and round we go…

It was a bad decision on our part, because an open-source project needs an open, publicly accessible issue tracker, not a limited-access non-public tracker.

I internally discussed migrating back to GitHub several times after I became team lead, and everyone on the team was supportive. But it was a large and time-consuming undertaking that, management felt, offered few if any tangible benefits, and time was a scarce resource. So it was effectively deprioritized several times.

Your point is well taken. In retrospect, the original migration from GitHub to JIRA was a mistake, as was the failure to quickly correct that mistake by migrating back. It’s hard to argue that those mistakes haven’t hurt the project’s image in the OSS community. It’s also hard to argue that they didn’t discourage more active participation by the broader community.

It’s unfortunate, because our intention (I’m taking the liberty of also speaking for my teammates here) was to foster participation and encourage developers outside our team to contribute.

But we are human and, like all humans, we make mistakes, despite our best intentions.

It’s easy to look back and lay blame; hindsight is 20/20 and all that. I don’t want to do that but I think it’s important that we look back and learn from past mistakes.

So where are we today? We’ve migrated away from JIRA almost entirely and are now tracking things on GitHub; the boards and the issue trackers are publicly accessible to everyone.

Thanks for raising this issue. Your criticism is fair and, since you offered it in the form of a question, I hope that this answer helps!

4 hours ago, nikb said:

Thanks for raising this issue. Your criticism is fair and, since you offered it in the form of a question, I hope that this answer helps!

Excellent reply again, Nik. Thanks for being so helpful with that.

May I ask a question that’s popped up lately? A job the same as or similar to yours is open at Ripple. Are you still with Ripple? Are you leaving?

If that’s too personal, I understand and apologise for raising it, but enquiring minds do want to know, if you are willing to say. :)

1 hour ago, BillyOckham said:

Excellent reply again, Nik. Thanks for being so helpful with that.

May I ask a question that’s popped up lately? A job the same as or similar to yours is open at Ripple. Are you still with Ripple? Are you leaving?

If that’s too personal, I understand and apologise for raising it, but enquiring minds do want to know, if you are willing to say. :)

It's not personal at all. Yes, I'm still with Ripple and still leading the C++ team. 

14 hours ago, tulo said:

Isn't there a timestamp? Probably I don't get how the close_time is calculated.

There is a timestamp; however, it seems that the partial validations contained the then-current timestamp, and there was no real way (or benefit) to rewrite all ledger headers after validations were no longer partial. Remember that ledger header hashes are also part of the ledger itself, and they in turn can influence the ordering (and thus outcome) of transactions.
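
To illustrate why rewriting headers after the fact is impractical, here is a minimal sketch of a hash chain in which each header commits to its parent's hash. It is a toy model, not rippled's actual header format: the JSON serialization, the three field names, and the all-zero starting hash are assumptions made for illustration, and `sha512_half` simply takes the first 256 bits of SHA-512.

```python
import hashlib
import json


def sha512_half(data: bytes) -> str:
    """Return the first 256 bits of SHA-512 as a hex string."""
    return hashlib.sha512(data).digest()[:32].hex()


def header_hash(header: dict) -> str:
    # Toy serialization (canonical JSON); the real binary layout differs.
    return sha512_half(json.dumps(header, sort_keys=True).encode())


def build_chain(close_times):
    """Build toy headers where each header embeds its parent's hash."""
    chain = []
    parent = "00" * 32  # hypothetical starting value for the sketch
    for index, close_time in enumerate(close_times, start=1):
        header = {
            "ledger_index": index,
            "parent_hash": parent,
            "close_time": close_time,
        }
        parent = header_hash(header)
        chain.append((header, parent))
    return chain


original = build_chain([100, 103, 107, 110])
# Same chain, except that ledger 2 gets a different close_time.
rewritten = build_chain([100, 104, 107, 110])

# Collect the ledgers whose hash no longer matches the original chain.
changed = [
    header["ledger_index"]
    for (header, digest), (_, new_digest) in zip(original, rewritten)
    if digest != new_digest
]
print(changed)  # [2, 3, 4]
```

Changing one timestamp changes that header's hash and, through `parent_hash`, every header after it, so "correcting" a close time retroactively would mean replacing the rest of the chain.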

33 minutes ago, Sukrim said:

What about https://github.com/ripple/rippled-specs/ if we're on the topic of private information about XRPL?

That repo is a bit of an experiment for our team and is meant to hold our "work in progress" drafts for things we're brainstorming and/or that we aren't ready to propose or publish for everyone to comment on. It contains no "private" information as such, and there's nothing magical in there that contains "secret sauce" that only the Ripple team is privy to.

You may feel that's not being transparent, but I wouldn't agree; it's reasonable for our team to have a place where we can internally collaborate on specs or brainstorm on ideas before we are ready to publish something to the community.

With that said, I'm happy to look into making this repo public. If you want to see how the sausage is made, then by all means :lol:

Link to post
Share on other sites

So @nikb, if you have time, can you say whether you think the near halt (or whatever it should be called) is now rectified and shouldn’t happen from that cause again?

Do you think that the recent foundation creation will have a positive effect on this type of monitoring and mitigation area of the XRPL?

Thanks for being engaged and spending your valuable time educating us here.  Much appreciated.

3 hours ago, BillyOckham said:

So @nikb, if you have time, can you say whether you think the near halt (or whatever it should be called) is now rectified and shouldn’t happen from that cause again?

Do you think that the recent foundation creation will have a positive effect on this type of monitoring and mitigation area of the XRPL?

Thanks for being engaged and spending your valuable time educating us here.  Much appreciated.

The bug that caused this issue has been fixed (I linked some of the relevant commits earlier). It’s unlikely to occur again because of that bug (since it’s been fixed) but I can’t definitively say it won’t happen again: code is a living thing, and humans are imperfect, so maybe something was missed or some future change will introduce a similar bug.

I hope the Foundation will have a hugely positive impact but, ultimately, what it boils down to is people and engagement.

The strength of an open-source project is the community around it.

8 hours ago, nikb said:

That repo is a bit of an experiment for our team and is meant to hold our "work in progress" drafts for things we're brainstorming and/or that we aren't ready to propose or publish for everyone to comment on.

I hope I'll find useful info there, because my NUC is still suffering :D

1 hour ago, tulo said:

I hope I'll find useful info there, because my NUC is still suffering :D

Please elaborate?

Also, to be very clear, the rippled-specs repo contains no code: just drafts of documents we are preparing that are intended as specs but which we don't feel meet the high bar we set for ourselves. For example, the "negative UNL" documentation that now lives in the `rippled` repo, under docs/0001-negative-unl, began life in the rippled-specs repo, where it was reviewed and refined by my colleagues prior to being made public.

20 minutes ago, nikb said:

Please elaborate?

Also, to be very clear, the rippled-specs repo contains no code: just drafts of documents we are preparing that are intended as specs but which we don't feel meet the high bar we set for ourselves. For example, the "negative UNL" documentation that now lives in the `rippled` repo, under docs/0001-negative-unl, began life in the rippled-specs repo, where it was reviewed and refined by my colleagues prior to being made public.

Oh my bad, I thought it was about hardware specs, maybe tests with different hardware running rippled.

BTW I was referring to this: 

 

Link to post
Share on other sites

@nikb in the spirit of what you have been saying about participating...

I went to the link you provided earlier and saw the excellent live consensus graphic (at XRPL.ORG), and going a bit further I accidentally tripped across this interesting ledger info saying there were missing UNL validators in one particular ledger.

I tried to find out more, but there is not (that I could find) a legend or explanation for the graphic. I thought I would add the suggestion to “show a legend with explanations”, and it directed me here to XRPChat, so I’m doing it now. :)

I think a legend would be useful, or an on-hover question mark for various elements in the graphic. I don’t understand what it is saying here about the UNL.

BTW I realise that you are head of the C++ team, and not minister in charge of educating random passers-by. I just thought here was a good spot to mention it, since the link to XRPChat didn’t arrive in a suggestions thread or topic or whatnot.

[Attached image: DF292CB7-42E3-48E3-AA95-E2F0C475FFCD.jpeg]

2 hours ago, BillyOckham said:

@nikb in the spirit of what you have been saying about participating...

I went to the link you provided earlier and saw the excellent live consensus graphic (at XRPL.ORG), and going a bit further I accidentally tripped across this interesting ledger info saying there were missing UNL validators in one particular ledger.

I tried to find out more, but there is not (that I could find) a legend or explanation for the graphic. I thought I would add the suggestion to “show a legend with explanations”, and it directed me here to XRPChat, so I’m doing it now. :)

I think a legend would be useful, or an on-hover question mark for various elements in the graphic. I don’t understand what it is saying here about the UNL.

BTW I realise that you are head of the C++ team, and not minister in charge of educating random passers-by. I just thought here was a good spot to mention it, since the link to XRPChat didn’t arrive in a suggestions thread or topic or whatnot.

[Attached image: DF292CB7-42E3-48E3-AA95-E2F0C475FFCD.jpeg]

livenet.xrpl.org tracks the network locally, so when you first load up the website it's possible that it will take a few ledgers for it to catch up. It's also possible that connectivity issues on your end could cause some validations to be missed or arrive late.

It looks good from my end.

 

On 10/13/2020 at 1:52 PM, nikb said:

If anyone was monitoring the validation stream they could have detected it in real-time.

@nikb I appreciate the thorough discussion on the topic both here and on Twitter. I do still sense a lot of confusion around the incident, even with @JoelKatz trying to clear the air.

I’m curious whether a postmortem would be useful for others to read. I understand you and the team at Ripple are not responsible for the XRPL, but from an earlier comment it looks like you guys did fix an issue (as seen in the commit https://github.com/ripple/rippled/commit/bd2a38f5844ce824c02cce1ed97e9cf0cd04c019). I did go through it, but a lot of the changes appear to come from an IDE auto-format adding a bunch of whitespace on top of the actual changes that fixed this.

From a very simplistic point of view, if you were monitoring for partial validations, would that have allowed you to catch this faster and alert validators? Even if you don’t know the specific cause, at least you’d see partial validations and know something is wrong. This seems like an easy enough tool to build. Would a tool that lets you check that there are no partial validations (or see which validators are issuing them) be useful to catch something like this in the future?

This is by no means me suggesting the tool be built by Ripple. I would give it a shot if I can free up some time in my personal life; I'm just wondering if such a tool would be useful.
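
As a rough idea of what the core of such a tool might look like, here is a minimal sketch that tallies partial validations per ledger from already-received stream messages. It assumes messages shaped like those on rippled's `validations` subscription stream, with a `type` of `validationReceived`, a boolean `full` flag, a `ledger_index`, and a `validation_public_key`; those field names should be checked against the current documentation. The sample messages, the validator keys, and the 20% threshold are made up for illustration, and the sketch does not open a network connection.

```python
from collections import defaultdict


def partial_share_by_ledger(messages):
    """Return {ledger_index: (partial_count, total_count)} for validations."""
    tally = defaultdict(lambda: [0, 0])
    for msg in messages:
        if msg.get("type") != "validationReceived":
            continue  # ignore any other stream traffic
        entry = tally[msg["ledger_index"]]
        entry[1] += 1
        # Assumption: anything not explicitly marked full counts as partial.
        if not msg.get("full", False):
            entry[0] += 1
    return {index: tuple(counts) for index, counts in tally.items()}


def ledgers_to_alert(messages, threshold=0.2):
    """List ledgers whose share of partial validations exceeds the threshold."""
    shares = partial_share_by_ledger(messages)
    flagged = [
        index
        for index, (partial, total) in shares.items()
        if partial / total > threshold
    ]
    return sorted(flagged, key=int)


# Hypothetical sample data; real messages carry many more fields.
sample = [
    {"type": "validationReceived", "ledger_index": "100",
     "validation_public_key": "nValidatorA", "full": True},
    {"type": "validationReceived", "ledger_index": "100",
     "validation_public_key": "nValidatorB", "full": True},
    {"type": "ledgerClosed", "ledger_index": 100},
    {"type": "validationReceived", "ledger_index": "101",
     "validation_public_key": "nValidatorA", "full": False},
    {"type": "validationReceived", "ledger_index": "101",
     "validation_public_key": "nValidatorB", "full": False},
    {"type": "validationReceived", "ledger_index": "101",
     "validation_public_key": "nValidatorC", "full": True},
]

print(ledgers_to_alert(sample))  # ['101']
```

A real monitor would feed this from a live subscription and would probably also report which validators sent the partial validations, which the `validation_public_key` field would allow.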
