Valuation · 7 min · 2026-06-14
Why I picked volume-weighted median over a simple average
The fair-value oracle has to turn five disagreeing venue prices into one number. Here is the algorithm I chose, and the ones I rejected.
At any given second, the same CS2 skin has at least five different prices. Steam Market says one thing, Skinport says another, and DMarket, CSFloat, and Buff163 each have their own. None of them is wrong, exactly. They are just five separate markets with different fees, different liquidity, and different buyers. The trouble is that fy_nance has to print a single number on your portfolio and call it fair value, and that number ends up in your tax math. So the question I had to answer early was a narrow one: when five venues disagree, what is the one number I am willing to defend?
This is the decision I made, why I made it, and the cleverer options I deliberately did not ship.
The number has to be hard to game
The first instinct is to just pick a venue. Use Steam, or use Buff163, and be done. I rejected that fast. Tying fair value to one venue means one venue's quirks become your portfolio's quirks. Steam's prices are inflated by a walled-garden wallet you cannot cash out. Buff163 runs in yuan with its own liquidity dynamics. Picking any single source bakes that source's distortion straight into your number, and worse, it means whoever can move that one venue can move your stated net worth.
The second instinct is a simple average across all five. Add the prices, divide by five. This feels fair and democratic, and it is a trap. Averages are not robust to outliers, and skin markets are full of outliers. A single thin listing, one seller who priced a knife at double market because they do not actually want to sell, drags the mean upward. One panic seller way below market drags it down. With only a handful of data points, a single weird listing has enormous leverage over the average. Skin listings are weird constantly, so an average would have your value flickering on noise.
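To put a rough size on that leverage, here is a small sketch using the four asks from the worked example further down. It only compares the plain mean and the plain median before and after the stray 8200 listing joins the set; the variable names are mine, not the oracle's.

```python
from statistics import mean, median

# Lowest asks from the worked example later in this post.
sane_asks = [4568, 4350, 4999]
stray_ask = 8200  # the single thin listing

all_asks = sane_asks + [stray_ask]

# How far does one stray listing move each summary statistic?
mean_shift = mean(all_asks) - mean(sane_asks)        # 5529.25 - 4639
median_shift = median(all_asks) - median(sane_asks)  # 4783.5 - 4568

print(f"mean moves by {mean_shift}, median moves by {median_shift}")
```

The mean moves by about 890 and the plain median by about 215. The median is sturdier, but with this few points it still shifts, which is part of why the shipped version adds weighting and rejection on top.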
The third instinct is "lowest ask," the floor price. This is the most gameable of all. The cheapest listing is exactly where the bait, the scams, and the mispriced-float items live. An item whose float is technically Field-Tested but sits at the ugly end of the range can list far below the real market for that wear, and it tells you nothing about what your item is worth. The floor is a number anyone can push down with one listing they never intend to honor. I was not going to anchor real money to it.
What I shipped: a volume-weighted median
The v1 oracle takes the lowest live ask from each venue, then combines them with two defenses stacked on top of each other.
First, weighting. Each venue's ask is not one equal vote. It is weighted by that venue's liquidity, which I compute from live order-book depth against 30-day traded volume. A deep, liquid venue with real two-sided activity counts heavily. A thin venue with one stale listing counts for almost nothing. So the venues where price discovery is actually happening drive the result, and the sleepy corners cannot.
Second, outlier rejection. Before I combine anything, I throw out any ask that sits more than two standard deviations from the mean of the set. That kills the obvious bait listing and the fat-finger price before it ever touches the calculation.
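A minimal sketch of a two-sigma filter follows. It is not the production code, and `sigma_filter` and its parameters are names invented here. One property is worth knowing first: with n unweighted points, no value can sit more than sqrt(n - 1) population standard deviations from the mean, so on five venues or fewer an unweighted band of strictly more than two sigma never rejects anything. The sketch therefore also accepts liquidity weights for the mean and the deviation. That is an assumption about how the band is built, and raw 30-day volume stands in for the real liquidity weight. The data is the four-venue example from later in this post.

```python
import math

def sigma_filter(asks, weights=None, k=2.0):
    """Split asks into (kept, rejected) using a k-sigma band around the mean.

    weights=None uses the plain mean and population standard deviation.
    Passing weights uses their weighted counterparts (an assumption here).
    """
    if weights is None:
        weights = [1.0] * len(asks)
    total = sum(weights)
    mu = sum(w * a for a, w in zip(asks, weights)) / total
    sigma = math.sqrt(sum(w * (a - mu) ** 2 for a, w in zip(asks, weights)) / total)
    kept = [a for a in asks if abs(a - mu) <= k * sigma]
    rejected = [a for a in asks if abs(a - mu) > k * sigma]
    return kept, rejected

asks = [4568, 4350, 4999, 8200]  # lowest ask per venue
volumes = [240, 180, 95, 1]      # 30-day volume, used as a stand-in weight

plain_kept, plain_rejected = sigma_filter(asks)
weighted_kept, weighted_rejected = sigma_filter(asks, volumes)

print("unweighted band rejects:", plain_rejected)
print("volume-weighted band rejects:", weighted_rejected)
```

On these numbers the unweighted band rejects nothing, because 8200 inflates the deviation it is measured against. The volume-weighted band rejects only 8200.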
Then I take the weighted median of what survives. Median, not mean, because the median does not care how far away an outlier is, only which side of the center it falls on. Weighted, so liquidity still decides who is near the center. The result is a number that a single listing cannot move and a single venue cannot capture.
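For readers who have not met a weighted median before, here is one common definition as a sketch: sort the values, accumulate weight, and return the first value at which the running weight reaches half the total (the "lower" weighted median). Whether the oracle uses this exact variant or an interpolating one is not stated above, so treat the function and the toy numbers as illustrative.

```python
def weighted_median(values, weights):
    """Lower weighted median: the smallest value at which the cumulative
    weight reaches half of the total weight."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal length")
    half = sum(weights) / 2
    running = 0.0
    ordered = sorted(zip(values, weights))
    for value, weight in ordered:
        running += weight
        if running >= half:
            return value
    return ordered[-1][0]  # only reachable through floating-point rounding

# Hypothetical toy quotes: two liquid venues and one thin, far-away listing.
near = weighted_median([100, 102, 900], [4, 5, 1])
far = weighted_median([100, 102, 9_000_000], [4, 5, 1])

# Shifting weight toward the cheaper venue moves the center toward it.
shifted = weighted_median([100, 102, 900], [6, 3, 1])
```

`near` and `far` are both 102: pushing the thin listing from 900 to nine million changes nothing, because only its side and its weight count. `shifted` is 100, since liquidity decides which venue sits at the center.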
A worked example
Say four venues quote the lowest ask on the same skin:
| Venue | Lowest ask | 30-day volume | Weight |
|---|---|---|---|
| Venue A | 4568 | 240 | high |
| Venue B | 4350 | 180 | high |
| Venue C | 4999 | 95 | medium |
| Venue D | 8200 | 1 | tiny |
Venue D is the problem. One listing at 8200, on a venue with a 30-day volume of exactly one. That is not a market, that is a person hoping.
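Step one, outlier rejection. The plain mean of the four asks is about 5529, and the spread is wide because of D. That width matters for the test. Measured against the unweighted numbers, the standard deviation is about 1560, so 8200 sits only about 1.7 sigma above the mean and survives: with four data points, one outlier inflates the very deviation it is judged against and can never clear two sigma. The band has to carry the liquidity weights too. Using 30-day volume as a stand-in for the weight, the mean drops to about 4578 and the deviation to about 276, so 8200 falls far outside the band and is rejected outright. It never gets a vote. Notice D would have been crushed anyway by its near-zero weight, but rejecting it first means it cannot nudge anything in the remaining math.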
Step two, the weighted median of A, B, and C. With A and B carrying heavy liquidity weights and C lighter, the center of mass sits right in the 4350 to 4568 band, landing around 4500. That is exactly where the two deep, actively traded venues already agree. The lonely 8200 contributed nothing, and the medium-liquidity 4999 pulled only gently. The number lands where the real trading is.
Compare that to the naive average of all four: 5529. The simple average is off by more than a thousand because of a single fake listing. That gap is the entire argument.
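The whole example can be run end to end as a sketch. Two assumptions are baked in and both are mine: raw 30-day volume stands in for the real liquidity weight (which also uses order-book depth), and the two-sigma band is built from the volume-weighted mean and deviation. `fair_value` is a hypothetical name, not the oracle's API.

```python
import math

# (venue, lowest ask, 30-day volume) from the table above.
QUOTES = [("A", 4568, 240), ("B", 4350, 180), ("C", 4999, 95), ("D", 8200, 1)]

def fair_value(quotes, k=2.0):
    """Volume-weighted k-sigma rejection, then a lower weighted median.

    Returns (value, names of rejected venues).
    """
    total = sum(vol for _, _, vol in quotes)
    mu = sum(ask * vol for _, ask, vol in quotes) / total
    sigma = math.sqrt(sum(vol * (ask - mu) ** 2 for _, ask, vol in quotes) / total)
    kept = [q for q in quotes if abs(q[1] - mu) <= k * sigma]
    rejected = [q[0] for q in quotes if abs(q[1] - mu) > k * sigma]

    # Walk the surviving asks in price order until half the weight is covered.
    half = sum(vol for _, _, vol in kept) / 2
    running = 0
    for _, ask, vol in sorted(kept, key=lambda q: q[1]):
        running += vol
        if running >= half:
            return ask, rejected
    raise ValueError("no quotes survived rejection")

naive_average = sum(ask for _, ask, _ in QUOTES) / len(QUOTES)
value, rejected = fair_value(QUOTES)

print(naive_average, value, rejected)  # 5529.25 4568 ['D']
```

With volume alone as the weight, this strict weighted median returns Venue A's 4568 rather than a figure near 4500, because it always returns one of the actual asks and knows nothing about order-book depth. Either way the result stays inside the 4350 to 4568 band, and the naive average sits roughly a thousand above it.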
The clever versions I deferred
I want to be honest that I considered fancier models and chose not to ship them yet.
Liquidity-adjusted pricing. Instead of only weighting venues by liquidity, actually discount the value of an item that only trades in thin markets and add a premium for one that trades deep. There is a real economic case here. An item you can only sell slowly is genuinely worth less than the headline number.
Time-decay weighting. Weight recent sales more heavily than older ones, so the value tracks momentum instead of treating a sale from three weeks ago the same as one from this morning.
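As a rough illustration of what time-decay weighting could look like, here is a sketch using exponential decay with a half-life. The half-life form, the seven-day default, and the sample sales are all assumptions made for illustration; nothing here is shipped or validated.

```python
def decay_weight(age_days, half_life_days=7.0):
    """Weight of a sale that is age_days old: halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)

def time_decayed_price(sales, half_life_days=7.0):
    """Weighted mean of (price, age_days) pairs, favoring recent sales."""
    weights = [decay_weight(age, half_life_days) for _, age in sales]
    return sum(price * w for (price, _), w in zip(sales, weights)) / sum(weights)

# Hypothetical sales: one from this morning, one from three weeks ago.
sales = [(4500, 0), (4000, 21)]

decayed = time_decayed_price(sales)
plain = sum(price for price, _ in sales) / len(sales)
```

With a seven-day half-life the three-week-old sale carries one eighth of the weight of this morning's, so `decayed` lands near 4444 while the plain average sits at 4250. Choosing the half-life is exactly the kind of parameter that needs recorded history to justify.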
Both are better models in theory. I deferred both for the same reason: I cannot validate them yet. Each one needs a real recorded price history to backtest against, and on day one I did not have one. I would rather ship something simple that I can defend line by line than something clever I cannot prove is right. A valuation tool that guesses elegantly and wrongly is worse than one that estimates plainly and honestly.
The good news is that the history is finally arriving. I now run a daily snapshot job that records every venue's asks, so the dataset to backtest the liquidity-adjusted and time-decay models is accumulating on its own. When I have enough of it to prove one of those models beats the weighted median on real data, I will promote it. Not before.
The principle underneath
There is one rule the whole oracle answers to: if a number cannot show its work, it does not ship. Fair value is not a black box you are asked to trust. When fy_nance prints a value, it can tell you which venues fed the calculation, what each one quoted, and which listings it rejected and why. You can see the 8200 sitting in the discard pile.
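One way to picture a number that shows its work is a record that carries its inputs and its discard pile along with the result. The sketch below is a hypothetical shape for such a record, filled in by hand with the worked example; the class names, fields, and report layout are illustrative and are not fy_nance's actual data model.

```python
from dataclasses import dataclass, field

@dataclass
class VenueQuote:
    venue: str
    ask: int
    used: bool
    reason: str = ""  # why the quote was rejected; empty when it was used

@dataclass
class Valuation:
    fair_value: int
    quotes: list[VenueQuote] = field(default_factory=list)

    def explain(self) -> str:
        """Render the value together with every quote that was seen."""
        lines = [f"fair value: {self.fair_value}"]
        for q in self.quotes:
            status = "used" if q.used else f"rejected ({q.reason})"
            lines.append(f"  {q.venue}: {q.ask} {status}")
        return "\n".join(lines)

valuation = Valuation(
    fair_value=4500,  # approximate figure from the worked example
    quotes=[
        VenueQuote("Venue A", 4568, used=True),
        VenueQuote("Venue B", 4350, used=True),
        VenueQuote("Venue C", 4999, used=True),
        VenueQuote("Venue D", 8200, used=False, reason="outside two-sigma band"),
    ],
)

report = valuation.explain()
print(report)
```

The rejected quote stays in the record instead of vanishing, so the report can always answer which venues fed the number and which one was thrown out.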
I would rather you understand a simple number than trust a clever one. That is the trade I made, and it is the one I will keep making until the data earns me the right to make a different one.