Election scheduler and prioritization revamp

I’ve added a branch: GitHub - nanocurrency/nano-node at election_scheduler, which is the start of an election scheduler component that will replace how elections are currently started. This separates the decision of which elections to prioritize from the mechanics of tallying them. Encapsulating prioritization this way simplifies the scheduling decision process because the call site requesting that an election be started doesn’t need to evaluate all other scheduling considerations.

Elections are currently started in the node by the following sources:

  • Live blocks off the network
  • Confirmed dependents - when a block’s dependencies are confirmed (its previous block and, for receive blocks, the send block referenced in the link field), that block can be activated if it’s already in the ledger
  • Higher-difficulty block rescheduling - when work of higher difficulty is published for an existing, unconfirmed block in the ledger which doesn’t already have an election
  • Vote observation hinting - when a percentage of the voting weight is seen for an unconfirmed block in the ledger which doesn’t already have an election
  • When requested through the block_confirm RPC command

With the new scheduler component in place, all these sources will feed into a central list of eligible elections so their priorities can be compared more accurately across all elections. The aim of this approach is to provide better control over which elections are started so that no single trigger dominates the resources. The ultimate result should be better election alignment across the network.
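For illustration, a hypothetical shape for such a component (the names here are assumptions, not taken from the branch):

    // Hypothetical sketch of a central election scheduler. Trigger
    // sites submit candidates; prioritization happens in one place.
    #include <memory>
    #include <vector>

    namespace nano
    {
    class block;

    class election_scheduler
    {
    public:
        // Each trigger site simply submits a candidate block; it no
        // longer weighs its request against other considerations.
        void activate (std::shared_ptr<nano::block> const & candidate)
        {
            candidates.push_back (candidate);
        }

    private:
        // Drained periodically: elections are started for the highest
        // priority candidates, compared across all sources at once.
        std::vector<std::shared_ptr<nano::block>> candidates;
    };
    }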

This branch also contains the start of an election prioritization scheduler which will start elections based on a combination of the account balance and the last time the account had activity. Separating accounts into balance tiers prevents scheduling starvation of any tier by any other tier.

The foundation for this design is the TaaC and PoS4QoS proposals by @Rob, with some modifications to support malleable and untrusted timestamps. While signed timestamps would give more mathematical rigor to the scheduling algorithm, they would require changing the signed block contents and coordinating network upgrades, and would take several months or more to implement, which doesn’t fit with the current tactical needs.

The scheduler will start elections for the highest-priority accounts in each balance tier in a round-robin fashion, and priority within each tier is least-recently-used order for the respective accounts. There will be 128 balance tiers, one for each bit in the 128-bit balance field, and the number of leading 0s in the balance will determine the bucket.
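To make the bucketing concrete, here is a minimal sketch (assuming a 128-bit raw balance and GCC/Clang's unsigned __int128; not code from the branch):

    // Sketch: map a 128-bit raw balance to one of 128 buckets by its
    // leading zero count, i.e. by the index of its highest set bit.
    #include <iostream>

    using uint128_t = unsigned __int128;

    int bucket_index (uint128_t balance)
    {
        if (balance == 0)
        {
            return 0; // 128 leading zeros; lowest bucket
        }
        int leading_zeros = 0;
        for (uint128_t mask = uint128_t (1) << 127; (balance & mask) == 0; mask >>= 1)
        {
            ++leading_zeros;
        }
        return 127 - leading_zeros; // index of the highest set bit
    }

    int main ()
    {
        // 0.001 nano = 10^27 raw; its highest set bit is bit 89,
        // matching the 2^89 to 2^90-1 tier mentioned later in the thread
        uint128_t raw = 1;
        for (int i = 0; i < 27; ++i)
        {
            raw *= 10;
        }
        std::cout << bucket_index (raw) << '\n'; // prints 89
    }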

Edit:
The scheduler can choose a bucket according to any probability function. The probability function is approximated by populating the schedule container with indices according to the desired bucket service frequency.
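A minimal sketch of that idea (names assumed): the schedule container holds bucket indices, repeated in proportion to each bucket's desired service frequency, and the scheduler just cycles through it.

    // Sketch: approximate an arbitrary bucket-selection probability
    // function by filling a circular schedule with bucket indices in
    // proportion to each bucket's desired service frequency.
    #include <cstddef>
    #include <vector>

    class bucket_schedule
    {
    public:
        // weights[i] is the relative service frequency of bucket i;
        // assumes at least one nonzero weight
        explicit bucket_schedule (std::vector<unsigned> const & weights)
        {
            for (std::size_t bucket = 0; bucket < weights.size (); ++bucket)
            {
                for (unsigned n = 0; n < weights[bucket]; ++n)
                {
                    schedule.push_back (bucket);
                }
            }
        }

        // Next bucket to service, cycling through the schedule
        std::size_t next ()
        {
            auto result = schedule[position];
            position = (position + 1) % schedule.size ();
            return result;
        }

    private:
        std::vector<std::size_t> schedule;
        std::size_t position{ 0 };
    };

Uniform weights reduce to plain round-robin; skewing the weights biases service toward particular tiers without changing the mechanism.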

The new scheduler will consider the following:

  • Priority scheduled accounts
  • Confirmed dependents
  • Vote observation hinting
  • When requested through block_confirm
  • Fork resolution - this design allows forks to be cleanly placed into their own low-priority tier.

Of note, the new priority scheme has the potential to remove PoW requirements largely or entirely. This would make nano integrations far simpler than they currently are and would set nano apart from all the other systems that view fees as the only solution to the scheduling design.

As a reminder, this is a focused discussion topic. If your post gets flagged as off-topic, please ensure it is specific and technical in nature to this design.

43 Likes

In a theoretical situation where everyone happens to be publishing transactions with the same number of leading zeroes, and no other transactions exist in any other buckets, should the throughput for the network as a whole remain the same as if those were spread out across buckets?

4 Likes

How will unopened accounts (i.e. OPEN blocks) be treated (since they have never been used, they have no timestamp to compare with and they also have no balance)?

2 Likes

It would be the same as if they were from different buckets. Many of the buckets will never be used, e.g. the 127-bit or 126-bit bucket.

As with the TaaC proposal, it would consider the source account’s timestamp/balance as well; otherwise an attacker could just string along new accounts to bypass the least-recently-used ordering.
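A sketch of how I read that rule (types and names are mine, not from the proposals):

    // Sketch: derive scheduling inputs for an open block from the
    // source (sending) account, so stringing along fresh accounts
    // can't reset least-recently-used ordering.
    #include <cstdint>

    using uint128_t = unsigned __int128;

    struct account_info
    {
        uint128_t balance;
        uint64_t last_activity; // local time of last scheduled block
    };

    struct priority_inputs
    {
        uint128_t balance;
        uint64_t timestamp;
    };

    priority_inputs inputs_for (bool is_open, account_info const & self, account_info const & source)
    {
        if (is_open)
        {
            // An unopened account has no balance or history of its
            // own yet; inherit the sender's tier and LRU position.
            return { source.balance, source.last_activity };
        }
        return { self.balance, self.last_activity };
    }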

2 Likes

Great stuff. Two questions:

  1. Will third party apps (i.e. wallets) need to change their implementation to be compatible with this change? Is it a breaking change?

  2. Will this allow for mixed PoW requirements, where some tiers may require PoW and others don’t, instead of having to choose between applying PoW to all transactions or none at all?

1 Like

I think the lowest tiers are too easy to attack with spam, unless PoW or some other deterrent remains. I did some napkin math in the protocol discord, which I’ll repeat here. Let’s say a spammer wants to greatly reduce quality of service for all accounts holding 0.001 Nano or less. They could do this by creating 100,000 accounts with balances at the bottom of each tier, constantly sending from each. How much would it cost them to do this? 0.001 Nano is in the 2^89 to 2^90-1 tier, so here’s some code:

    # Raw needed to fund 100,000 accounts at the bottom of each tier
    # from 2^1 up to 2^89, converted to Nano (1 Nano = 10^30 raw)
    y <- 0
    for (x in 1:89) {
      y <- y + 2^x * 100000
    }
    y / 10^30

Yields 123.794 Nano, currently just 550 USD. That’s how much it would cost to fund 8,900,000 accounts covering all tiers at 2^89 and below. If the network were confirming at reduced bandwidth, around 20 CPS, accounts transacting in those tiers would be able to transact approximately once per 5 days (20 CPS spread round-robin over 89 tiers is about 0.22 CPS per tier, and 100,000 accounts serviced at 0.22 CPS come around roughly every 5 days). And without PoW, an attacker could do this indefinitely while also bloating the ledger.

Please check my math and understanding of the tiers and priority system.

6 Likes

TaaC and P4Q are the foundation. I am sure the final draft will have considered those sorts of vectors as well.

I think you missed the second part of that paragraph. The priority within each tier is LRU, which means (if I understood correctly) people who transact less frequently will have higher priority than people who transact more frequently (e.g. a spammer). So if there is a tie between an individual doing 1 tps and a spammer doing 5 tps, the individual will have higher priority. Again, I could be understanding this wrong.

Edit: OK, I was thinking of LFU, not LRU. Wouldn’t LFU be better in this case?

That is how I understood it as well. It's LRU by account, so if there are 100,000 accounts sending transactions in a single tier, and Account A sends a transaction, it will get through quickly. However, they will then be the most recently used account, and their next transaction will have to wait for all other transactions to clear.
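A minimal sketch of that ordering (assumed names; a real container would key on account IDs and ledger state):

    // Sketch: within one bucket, accounts are ordered by the local
    // time of their last scheduled block. The longest-idle account
    // is serviced first, then moves to the back of the queue.
    #include <cstdint>
    #include <map>
    #include <string>

    class lru_bucket
    {
    public:
        void track (std::string const & account, uint64_t last_used)
        {
            by_last_used.emplace (last_used, account);
        }

        // Service the least recently used account and stamp it as
        // just used. Precondition: the bucket is non-empty.
        std::string service (uint64_t now)
        {
            auto front = by_last_used.begin ();
            auto account = front->second;
            by_last_used.erase (front);
            by_last_used.emplace (now, account); // back of the queue
            return account;
        }

    private:
        std::multimap<uint64_t, std::string> by_last_used;
    };

So after Account A is serviced, its key becomes the current time and every other tracked account in the tier sits ahead of it.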

3 Likes

I’m still working on an upgrade path for this, but as far as I can see so far this is backwards compatible.

Ideally we want to get rid of the PoW requirement altogether. If it were to be mixed, the LRU buckets would come first and then the PoW difficulty. The disadvantage of continuing with PoW is the implementation complexity and the code complexity of merging the two concepts.

4 Likes

If we remove PoW entirely, won’t we be constantly exposed to saturation attacks? If there is no cost to create a transaction, even a 1 raw transaction, won’t spammers always saturate the network, putting more stress on nodes than necessary?

What I was going to suggest in the initial message was to use the PoW multiplier as a way to jump buckets. So, if a transaction is in bucket 97 and its multiplier is 5.2, it moves to bucket 102.
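A sketch of the rule as stated (the cap at bucket 127 is my assumption):

    // Sketch of the suggested rule: raise a transaction's bucket by
    // the whole part of its work multiplier, e.g. bucket 97 with
    // multiplier 5.2 becomes bucket 102.
    #include <algorithm>
    #include <cmath>

    int effective_bucket (int base_bucket, double work_multiplier)
    {
        int jump = static_cast<int> (std::floor (work_multiplier));
        return std::min (base_bucket + jump, 127);
    }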

I haven't thought that much on this idea. But would this be "bad"?

3 Likes

I think there are some other things that, when separately addressed, will make this issue go away.

The bandwidth limit defines both the network CPS and the ledger growth rate. This model does assume, especially if PoW is removed, that the network will run at full occupancy, limited by the aggregate PR bandwidth limits. The PRs can decide where they want to see the confirmation rate in relation to ledger growth.

Right now each additional account affects how the ledger and synchronization processing are handled, but this is just an implementation detail. Aside from per-account pruning, there’s no specific reason why a new account should have any more impact than a new transaction.

The last thing is the same issue any blockchain has: how do we deal with dust amounts in perpetuity as the ledger continues forever? Setting the CPS rate low can mitigate this for a while, i.e. the keep-blocks-small argument in Bitcoin. In the immediate term this can be controlled by the PR bandwidth limits that control ledger growth while we investigate secondary storage or dust-removal proposals.

1 Like

If PoW is removed, the network will likely always be saturated up to the bandwidth limits, yes. If adopted, the bandwidth limit caps used by PRs will be the way the network controls CPS/ledger growth in a decentralized way.

The issue with saturation right now is an implementation detail where it is not throttling or coordinating correctly. This scheduling algorithm will help with both of those issues, since each PR will prioritize elections in mostly the same way.

3 Likes

Does this then push the saturation target back to block publishing and propagation? Currently voting is a large part of the bandwidth, but with PoW removed and the need to gossip blocks (so all PRs have them for proper election evaluation and alignment) remaining, it seems a spammer would then turn to just overwhelming the network with blocks, regardless of whether they will be confirmed.

This seems to lean in favor of keeping PoW, or perhaps adjusting the way blocks are gossiped on the network.

2 Likes

Wouldn't an LFU be better suited for this? If tiers are tied, a person whose account isn't used frequently would have priority over someone who is constantly sending transactions.

I'm interested in knowing more about what will make this issue go away.

I also think that, in actual use, only a few tiers will contain nearly all legitimate transactions. In that case, the tiny tiers being heavily utilized would effectively drain network CPS. We should then seek to have as few 'easily-spammable' tiers as possible in order to increase the QoS for all legitimate tiers. For instance, I might argue transactions below 0.001 Nano are currently less likely to be legitimate uses of Nano as a currency, and yet those transactions would be allocated up to 3/4 of all CPS if all tiers were being utilized.

2 Likes

I think a policy of "forward if locally prioritized, otherwise drop" does the trick.

For instance, if the 1-tier is full, any more recently used account in that tier would be dropped by everyone until occupancy is available.
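A sketch of that policy under an assumed per-bucket occupancy limit:

    // Sketch: a node republishes a block only if its own scheduler
    // would service it; past the local occupancy limit it's dropped,
    // so unprioritized traffic dies out instead of propagating.
    #include <cstddef>

    struct bucket_state
    {
        std::size_t occupancy; // elections queued in this bucket
        std::size_t capacity;  // local limit for this bucket
    };

    bool should_forward (bucket_state const & bucket)
    {
        return bucket.occupancy < bucket.capacity;
    }

A fuller version would also let a block displace a queued entry from a more recently used account, per the example above.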

With LFU it seems someone could make millions of, say, 1-nano accounts, let them sit for months, and then start activating them; that entire tier would be inaccessible to anyone with more than 1 transaction until the dormant accounts had executed more transactions.

It’s also harder to generate a frequency number than a most-recently-used timestamp. Bootstrapping would also be difficult for the same reason.

1 Like

What if LFU used the node’s own timestamp and an average over, say, the last 5 minutes to determine the frequency?

1 Like