I would like to reopen the discussion on how blocks are prioritized within each bucket
Right now blocks are sorted into buckets based on balance. The node is then processing one block from each bucket in a round robin. This part works fine in my opinion
The blocks in each bucket is currently sorted by LRU (Least Recent usage). The idea is that accounts that rarely do any transactions gets a higher priority than account that transacts all the time.
This sorting is a sympathetic idea but it comes with a big drawback. If someone is spamming a bucket at a very rate, nodes will have a slightly different idea of which block has the highest LRU. This makes the nodes vote on different blocks and thus harder to reach consensus which again reduces transaction rate and the bucket can be filled further by the spammer
From the testing I've done it seems like we have no problem in processing spam at 175 CPS on current beta network. However, if we go just a bit faster than that, we will start building a backlog and CPS drops significantly. As backlog grows, cps drops even further. Watching a vote visualizer it is clear that nodes are no longer voting on the same blocks and if they don't reach consensus in a certain timeframe they start voting on another block. This creates a lot of wasted votes and bandwidth and not very efficient
I see two solutions:
Sort blocks alphabetically by hash. All nodes agrees on the alphabet so they will vote on the blocks in the same priority. It is easy to implement, but it does remove LRU prioritization. In the end I believe that all transactions will get confirmed faster this way than with the current implementation
A combination of LRU and alphabetic sorting. The idea is to bundle blocks into groups of LRU and then sort this group alphabetically. This way LRU will be intact but not as fine grained as before, but we will still get the speed improvements from sorting alphabetically
The original idea of sorting by hash was came from Rafster. All credits go to him
I'm no expert on this subject so please correct me if I've misunderstood something. English is also not my first language so I hope I explained it ok
Legitimate transactions will come, their previous block will have an older timestamp, they will be processed first.
What are you on about? No two accounts will have the same timestamp for their frontier blocks.
I think the issue arises when the AEC across nodes goes out of alignment during spam, then confirmations for all blocks (regardless of LRU) get impacted.
Ricky, has anyone tested confirmations for "legit transactions" during the tps slowdown you mention?
I see two possible beta tests to see if there really is an issue
- Occasional legit transactions during cps slowdown
- 200k+ spam from new accounts above the threshold cps (say 350bps) immediately followed by 100k+ spam below threshold (say 50bps) from 'old accounts' during cps slowdown.
Is there an easy way to track '2' to see if the older set work their way in front? Would be interesting to see if cps rises by 50 when set 2 starts.
Also, if I was an attacker... I'd sit on a bunch of accounts for 12 months then unleash hell - how does the network get around the patient attacker?
(another thought... Internal bucket alignment may not matter if an attacker sends thousands of different blocks directly to different nodes, they'll not see alignment no matter what the sorting...)
I agree that anything that does not get close to "AEC sync" (the active election container of each node contains the same list of blocks at any given time) will delay confirmations and waste network resources. The extreme example you mentioned - where LRU is very similar for all blocks and therefore sorting by LRU leads to a high level of randomness of which blocks each node is voting on - is not an unrealistic scenario under high network loads, even without an ongoing spam attack.
So either AEC sync can be achieved by a property of the block (like solutions 1 and 2 you mentioned) or it has to happen very efficiently and quickly afterwards via votes.
I propose using incoming votes to change the local prioritization. Let's say all incoming blocks within a bucket are very similar in LRU. Node A will have a more or less random selection of these blocks inside their AEC because it sees these blocks at slightly different times than other nodes. Similar situation for node B, just with different blocks inside their AEC. Then node A votes on the first block001 it has inside its AEC. Node B receives this vote but doesnt find the block inside their AEC. If vote_weight_nodeA is > vote_weight_nodeB, then node B will add the block to their AEC, removing a random block (coming from the same bucket) it hasn't voted on. This would not sync the AEC rightaway but it would allow for PRs with high voting weight to kind of impose their local LRU timings on smaller nodes while not wasting their votes. In turn "misaligned" blocks that a small node is voting on won't affect the content of AECs of larger nodes, so if they don't have them in their AEC because of their local LRU sorting, they would be left in the dust. This idea was partially inspired by PlasmaPower's proposal to use vote weight for local prioritization parameters (Discord).
This is a clever idea.
I wonder how resource heavy it would be.... Also, as the network becomes more decentralised then the effectiveness of this approach will drop leading to sync problems again (although you could tally total vote weight per block)
If memory serves me, I think we tested this in the past and found that "legit" transactions are prioritized. However, that is not really the issue I'm addressing. I'm more worried about how slow a large backlog is being processed and the wasted resources during this process
I'd love to do more testing on the beta network on this
About the "Sitting on a bunch of accounts for 12 months" : Yes, an attacker could do that but it makes it a bigger hassle to prepare the spam, and when all the account have sent one transaction they are can't be used for another 12 months. The attack is also less effective if transactions are confirmed at 275+ CPS instead of 5-10 CPS
I have tested legit transactions during the last spam on beta. I also tested the same extensively in private networks.
The prioritisation works to some extend but if the network is above saturation, confirnations of legit tend to take very long. (above 15s)
For the tests I did on local networks running V23.3 I set the timeout to 20s before sending the next legit transaction. I saw timeout rates between 5%-75%.
(got the results somewhere)
So I think there is a need to improve!
The biggest problem with the prioritization as it is is the RR approach to the buckets themselves. It's an implementation of the original TaaC proposal that lacks P4Q, but P4Q covers a weakness of TaaC: Sybil.
For 128 buckets every 2^n raw, what this means is that spamming equally across all buckets up to 2^100 will throttle the network by roughly 90%.
Your 15-20s confirmation times would drop to 1.5-2s if the only thing that changed was prioritizing higher buckets before lower buckets. It's the single biggest fix that could be applied and wouldn't even take a major overhaul.
The other problems in the thread (if I understood them at a glance) would be fixed by having timestamp on the transaction so it isn't based upon node receipt time, hence no desyncing.
Assuming dev team's end goal is TaaC/P4Q as originally discussed, or some close variant of it, both of these issues should actually be addressed in the ways described above.