1 Comment
Paul (1d, edited)

Thanks again.

I assume that in this case,

"If p(guess_i) < q(guess_i), Mq was "overconfident." The guess is still accepted, but only with probability p(guess_i) / q(guess_i). This probabilistic step is key to maintaining the original distribution."

guess_i will be rejected whenever p(guess_i) / q(guess_i) < p(some_other_token_i), depending on the decoding strategy (e.g., greedy decoding), won't it?
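
For reference, here is a minimal sketch of the acceptance step as I read the quoted passage. It assumes the standard speculative sampling formulation, in which the ratio p(guess_i) / q(guess_i) is compared against a uniform random draw rather than against another token's probability. The function name `accept_guess` and the example probabilities are hypothetical, chosen only for illustration.

```python
import random


def accept_guess(p_guess: float, q_guess: float, rng: random.Random) -> bool:
    """Decide whether to keep a draft token proposed by Mq.

    p_guess: probability the target model assigns to the guessed token.
    q_guess: probability the draft model Mq assigned to the same token.
    """
    if q_guess <= 0.0:
        raise ValueError("q(guess_i) must be positive for a token Mq proposed")
    # If p >= q the ratio is at least 1, so the guess is always accepted.
    accept_prob = min(1.0, p_guess / q_guess)
    # Assumption: acceptance is decided by a uniform draw on [0, 1).
    return rng.random() < accept_prob


rng = random.Random(0)
trials = 100_000
# "Overconfident" case: p = 0.2 < q = 0.5, so the ratio is 0.4.
accepted = sum(accept_guess(0.2, 0.5, rng) for _ in range(trials))
print(accepted / trials)  # empirical acceptance rate, roughly 0.4
```

In this sketch no other token's probability enters the accept/reject decision. Whether that still holds under greedy decoding is exactly what I am asking about.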
