Story points as described represent an attempt to abstract estimation away from “amount of stuff done per unit time”, because we’re bad at doing that and people were traditionally using that to make us look bad. So we introduce an intermediate value, flip a ratio, and get the story point: the [meaningless value] per amount of stuff. Then we also get the velocity, which is the [meaningless value] per unit time, and…
…and we’re back where we started. Except we’re not, we’re slower than we were before, because it used to be that people asked “how much do you think you can get done over the next couple of weeks”, and we’d tell them, and we’d be wrong. But now they ask “how big is this stuff”, then they ask “how much capacity for stuff is there over the next couple of weeks”, and we tell them both of those things, and we get both wrong, so we still have a wrong answer to the original question but answered two distinct questions incorrectly to get there.
There’s no real way out of that. The idea that velocity will converge over time is flawed, both because the team, the context, and the problem are all changing at once, and because the problem with estimation is not that we’re Gaussian bad at it, but that we’re optimistic bad at it. Consistently, monotonically, “oh I think this will just mean editing some config, call it a one-pointer”-ingly, we fail to see complexity way more than we fail to see simplicity. The idea that even if velocity did converge over time, we would then have reliable tools for planning and estimation is flawed, because what people want is not convergence but growth.
Give people 40 points per sprint for 20 sprints and you’ll be asked not how you became so great at estimation, but why your people aren’t getting any better. Give them about 40 points per sprint for 20 sprints, and they’ll applaud the 44s and frown at the 36s.
The assumption that goes into agile, lean, kanban, lean startup, and similar ideas is that you’re already doing well enough that you only need to worry about local optima, so you may as well take out a load of planning overhead and chase those optima without working out your three-sprint rolling average local optimisation rate.