retryW

Down the xRabbit Hole


Great job, and thanks for uploading some of your queries to Pastebin. I still haven't got around to trying BigQuery, but having examples to use might be an incentive to get started.


If you've made it this far, I thank you for having the patience to make it through this horrible mess that is my train of thought during research.

You have my sympathies. It is tedious work going through the transactions and looking for anything interesting.

Couple of questions: Do you know if it is possible to script queries using Python in BigQuery? I am unfortunately a C++ programmer by trade with some limited Python skills, but I haven't really done any database work, so I find SQL to be somewhat painful. Can one extract subsets of data from the database and import them into pandas, for example, and then use Python to do plotting and analysis (either locally or on the Google servers)? I'm very much interested in the overall xRapid volume (not so much in the individual transactions), so being able to extract interesting data and download it locally would be handy. Any tutorial for this kind of thing would be helpful, if you discovered any good ones during your learning phase.

Thanks for showing us your results. It's still very early in the game and there aren't many interesting statistics to see yet, but hopefully soon...

 

Posted (edited)
4 hours ago, jbjnr said:

Great job, and thanks for uploading some of your queries to Pastebin. I still haven't got around to trying BigQuery, but having examples to use might be an incentive to get started.

Thanks for showing us your results. It's still very early in the game and there aren't many interesting statistics to see yet, but hopefully soon...

 

Thanks, I very much appreciate it!  And you're most welcome; I thought it would be good to share the queries I used so that others would have a place to start, as well as to back up my data so you knew it wasn't pulled out of thin air.

4 hours ago, jbjnr said:

Couple of questions: Do you know if it is possible to script queries using Python in BigQuery? I am unfortunately a C++ programmer by trade with some limited Python skills, but I haven't really done any database work, so I find SQL to be somewhat painful. Can one extract subsets of data from the database and import them into pandas, for example, and then use Python to do plotting and analysis (either locally or on the Google servers)? I'm very much interested in the overall xRapid volume (not so much in the individual transactions), so being able to extract interesting data and download it locally would be handy. Any tutorial for this kind of thing would be helpful, if you discovered any good ones during your learning phase.

- No, you can't use Python to query it.  It's essentially just a big SQL database, but with very optimized/machine-learned searching abilities.

- EDIT: Apparently you can use a range of languages through the client library (thanks @Swiss):

https://cloud.google.com/bigquery/docs/reference/libraries

- I'm not too sure what a panda is (well, the animal, yes xD), but I could supply the query you need for the basic overall volume of a corridor. There is a button to export results to CSV/XLSX.

- I can create a tutorial on basic queries, as well as the basics of the Data Explorer, which essentially lets you make various graphs based on the queried data (although I'm still fairly new to it all).
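Since the client library supports Python, scripting a query and pulling it into pandas looks roughly like this. This is a minimal sketch, not anything from the thread: the query text, function names, and the `project_id` parameter are my own, and it assumes the `google-cloud-bigquery` and `pandas` packages are installed with Google Cloud credentials configured in the environment.

```python
# Hypothetical sketch: scripting a BigQuery query from Python and loading
# the result into a pandas DataFrame. The query text is illustrative.

def drops_to_xrp(drops: int) -> float:
    """On-ledger XRP amounts are stored in drops; 1 XRP = 1,000,000 drops."""
    return drops / 1_000_000

def top_senders(project_id: str):
    # Deferred import so the drops_to_xrp helper works without GCP packages.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    sql = """
        SELECT Account, COUNT(*) AS PaymentCount
        FROM `xrpledgerdata.fullhistory.transactions`
        WHERE TransactionType = "Payment"
          AND TransactionResult = "tesSUCCESS"
        GROUP BY Account
        ORDER BY PaymentCount DESC
        LIMIT 10
    """
    # to_dataframe() loads the result set straight into pandas for local
    # plotting and analysis.
    return client.query(sql).to_dataframe()
```

From there the DataFrame can be saved locally with `to_csv()` rather than using the export button in the web console.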

 

As stated in the blog post, BigQuery is surprisingly expensive.  It bills by the amount of data scanned, and as the XRPL dataset is quite large, each query was scanning either 68 GB or 130 GB of data.  It cost me approximately $200 AUD over the course of that research (I'd accidentally already set up billing for the free server-hosting trial).

There is a free quota per month, but I forget what it is.
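For what it's worth, the client library can estimate a query's cost before running it via a dry run, which scans nothing and is free. A rough sketch, assuming Google's on-demand rate of about US$5 per TB scanned at the time of writing (check the current pricing for your region; the function names here are mine):

```python
# Sketch of pre-flight cost estimation with a BigQuery dry run.
# The US$5-per-TB on-demand rate is an assumption based on Google's
# published pricing, not a guarantee.

ON_DEMAND_USD_PER_TB = 5.0

def estimate_query_cost_usd(bytes_processed: int) -> float:
    """Convert a bytes-scanned figure into an approximate on-demand cost."""
    return bytes_processed / 10**12 * ON_DEMAND_USD_PER_TB

def dry_run_cost_usd(project_id: str, sql: str) -> float:
    # Deferred import so the pricing helper above has no GCP dependency.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)  # returns immediately; nothing is billed
    return estimate_query_cost_usd(job.total_bytes_processed)
```

At that rate a 130 GB scan works out to roughly US$0.65 per query, so repeatedly re-running queries while debugging is where the cost really accumulates.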

 

Edited by retryW

3 hours ago, retryW said:

Thanks, I very much appreciate it!  And you're most welcome; I thought it would be good to share the queries I used so that others would have a place to start, as well as to back up my data so you knew it wasn't pulled out of thin air.

For those interested in analyzing the top transacting wallets, I wrote the following query a couple of months ago:

WITH
  payments AS (
  SELECT
    t.Account AS Sender,
    t.Destination AS Receiver,
    COUNT(*) AS Count,
    MIN(COALESCE(t.DeliveredAmount, t.AmountXRP)) / 1000000 AS SmallestPayment,
    AVG(COALESCE(t.DeliveredAmount, t.AmountXRP)) / 1000000 AS AveragePayment,
    MAX(COALESCE(t.DeliveredAmount, t.AmountXRP)) / 1000000 AS HighestPayment,
    SUM(COALESCE(t.DeliveredAmount, t.AmountXRP)) / 1000000 AS SumPayments,
    MIN(t.LedgerIndex) AS MinLedgerIndex,
    MAX(t.LedgerIndex) AS MaxLedgerIndex
  FROM
    xrpledgerdata.fullhistory.transactions AS t
  WHERE
    TransactionType = "Payment"
    AND TransactionResult = "tesSUCCESS"
    AND AmountXRP IS NOT NULL
    AND Destination IS NOT NULL
    AND Destination != Account
  GROUP BY
    Account, Destination)
SELECT
  p.Sender,
  p.Receiver,
  p.Count,
  p.SmallestPayment,
  p.AveragePayment,
  p.HighestPayment,
  p.SumPayments,
  lmin.closeTime AS MinTime,
  lmax.closeTime AS MaxTime,
  lmin.LedgerIndex AS MinLedger,
  lmax.LedgerIndex AS MaxLedger
FROM
  payments AS p
  INNER JOIN xrpledgerdata.fullhistory.ledgers AS lmin ON p.MinLedgerIndex = lmin.LedgerIndex
  INNER JOIN xrpledgerdata.fullhistory.ledgers AS lmax ON p.MaxLedgerIndex = lmax.LedgerIndex
WHERE
  AveragePayment >= 1000 AND Count >= 1000
ORDER BY
  Count DESC

Feel free to adjust it to your needs. The default WHERE clause includes all wallet pairs with at least 1,000 transactions and an average payment amount of 1,000 XRP or more.

It will output a table like this:

[Screenshot: TopWallets query results table]
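For anyone who wants to sanity-check the aggregation logic without paying for a scan, here's an illustrative plain-Python mirror of the per-pair statistics from the query above, run on made-up drop values (not real ledger data; the function names are mine):

```python
# Illustrative mirror of the query's per-pair aggregation on toy data.
# COALESCE(DeliveredAmount, AmountXRP) prefers what was actually delivered,
# since partial payments can deliver less than the requested Amount; the
# division by 1,000,000 converts drops to XRP.

def delivered_or_amount(delivered_drops, amount_drops):
    """Python equivalent of COALESCE: fall back when delivered is NULL."""
    return amount_drops if delivered_drops is None else delivered_drops

def summarize(payments):
    """payments: list of (delivered_drops_or_None, amount_drops) tuples."""
    xrp = [delivered_or_amount(d, a) / 1_000_000 for d, a in payments]
    return {
        "Count": len(xrp),
        "SmallestPayment": min(xrp),
        "AveragePayment": sum(xrp) / len(xrp),
        "HighestPayment": max(xrp),
        "SumPayments": sum(xrp),
    }
```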

3 hours ago, retryW said:

Thanks, I very much appreciate it!  And you're most welcome; I thought it would be good to share the queries I used so that others would have a place to start, as well as to back up my data so you knew it wasn't pulled out of thin air.

- No, you can't use Python to query it.  It's essentially just a big SQL database, but with very optimized/machine-learned searching abilities.

- I'm not too sure what a panda is (well, the animal, yes xD), but I could supply the query you need for the basic overall volume of a corridor. There is a button to export results to CSV/XLSX.

- I can create a tutorial on basic queries, as well as the basics of the Data Explorer, which essentially lets you make various graphs based on the queried data (although I'm still fairly new to it all).

As stated in the blog post, BigQuery is surprisingly expensive.  It bills by the amount of data scanned, and as the XRPL dataset is quite large, each query was scanning either 68 GB or 130 GB of data.  It cost me approximately $200 AUD over the course of that research (I'd accidentally already set up billing for the free server-hosting trial).

I'll spend a bit of time having a look at pages like this in a few weeks when I might have time: https://solvewithdata.com/knowledge-base/using-google-bigquery-api-to-load-bigquery-tables-into-pandas-dataframes/ It seems to be possible to get Python going with it, but it may take some experimentation for someone who's not familiar with the toolchains.

Is BigQuery expensive because you are debugging a lot of searches and rerunning them again and again as you narrow down what you're looking for, or is it just expensive anyway, even when you try not to use too many CPU hours and too much download bandwidth? My own testing with the ledger queries using the Ripple API was shockingly wasteful, and I would frequently just delete a massive dataset and re-download it when I wanted it again. (Now that the scripts are a bit more stable I don't do that so much.)

If you have time to make a tutorial, then please do (but don't waste time just on my account).

 

4 minutes ago, jbjnr said:

I'll spend a bit of time having a look at pages like this in a few weeks when I might have time: https://solvewithdata.com/knowledge-base/using-google-bigquery-api-to-load-bigquery-tables-into-pandas-dataframes/ It seems to be possible to get Python going with it, but it may take some experimentation for someone who's not familiar with the toolchains.

You can use the BigQuery client library, which is available in several languages:

https://cloud.google.com/bigquery/docs/reference/libraries


Great work.

Just to correct a couple of typos: in the Bittrex-to-Bitso analysis you seem to have misspelt "continual" ("with no recent continualt activity"), and in the hunt, "A transaction for just over 1,200 XRP" is actually referencing 12,000 XRP, so it's missing a zero.

1 hour ago, scoobysi said:

Great work.

Just to correct a couple of typos: in the Bittrex-to-Bitso analysis you seem to have misspelt "continual" ("with no recent continualt activity"), and in the hunt, "A transaction for just over 1,200 XRP" is actually referencing 12,000 XRP, so it's missing a zero.

Thanks!  Will correct them  :)

