
Note that for any fine-tuned models (like GPT-4, where the foundation model has not been made accessible), the model no longer gives the "probabilities" of the next tokens, but rather their "goodness": the numbers say how good a token would be relative to the aims the model inferred from its fine-tuning.


Isn’t that the same thing? The non-fine-tuned models also have assumptions based on corpus and training. I don’t think there’s such a thing as a purely objective probability of the next token.


It's very different. We don't know exactly what the model considers good after fine-tuning (which can lead to surprising cases of misalignment), while the probability that something is the next token in the training distribution is very clear. I don't know how they measure it, but they can apparently measure the "loss", which (I think) says how close the model is to some sort of real probability.
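The "loss" here is presumably cross-entropy: the negative log of the probability the model assigned to the token that actually came next. A minimal sketch (the four-token vocabulary and the distribution are made up for illustration):

```python
import math

def cross_entropy_loss(probs, target_index):
    """Cross-entropy loss for one prediction: the negative log of the
    probability the model assigned to the actual next token.
    Lower loss means the model put more probability on the truth."""
    return -math.log(probs[target_index])

# Hypothetical distribution over a 4-token vocabulary:
probs = [0.7, 0.2, 0.05, 0.05]
loss_good = cross_entropy_loss(probs, 0)  # model favored the right token
loss_bad = cross_entropy_loss(probs, 2)   # right token got only 5% probability
```

Averaged over a held-out corpus, this is how close the model's distribution is to the real next-token distribution.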


What I meant was, fine tuning is not substantially different from training. It seems odd to use different words for the resulting systems.


But fine-tuning is very different from (pre)training. Pretraining proceeds via unsupervised learning on massive amounts of data and compute, while fine-tuning uses much smaller amounts, with supervised learning (instruction tuning) and reinforcement learning (RLHF, constitutional AI).


"no longer"??

Deep learning models (of which LLMs and GPTs are a type) have never returned probabilities. Ever. Why do people suddenly have that hallucination?


They do produce probabilities at the end of generation, and they do select a single token for output: either the one with the highest probability, or a somewhat randomized choice.

So end users see only one value. But with access to the internals, all high-probability variants can be considered. The easy way to do it is to select one token, save the state, look forward, then roll back to the saved state and try another token, finally selecting the best output. The smart way is to do this only at key points, where it matters most. Selecting those points is a different task, maybe for another model.
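The save-state/roll-back idea can be sketched with a hypothetical model interface. `ToyModel`, `step`, `greedy`, and `score` are stand-ins invented for illustration, not any real API:

```python
import copy

class ToyModel:
    """Hypothetical stand-in for a model with accessible internals.
    State is just the token list generated so far; score() rewards
    even tokens, so looking ahead can beat a blind first choice."""
    def step(self, state, token):
        state.append(token)
    def greedy(self, state):
        # Deterministic toy continuation rule.
        return (state[-1] * 2) % 5
    def score(self, state):
        return sum(1 for t in state if t % 2 == 0)

def lookahead_choice(model, state, candidates, depth=2):
    """For each candidate token: save the state, try the token,
    look a few steps forward, score the result, and roll back.
    Keep the candidate whose continuation scored best."""
    best_token, best_score = None, float("-inf")
    for token in candidates:
        trial = copy.deepcopy(state)      # save the state
        model.step(trial, token)          # try this token
        for _ in range(depth - 1):        # look forward a little
            model.step(trial, model.greedy(trial))
        s = model.score(trial)
        if s > best_score:
            best_token, best_score = token, s
    return best_token                     # original `state` is untouched

model = ToyModel()
state = [1]
chosen = lookahead_choice(model, state, [3, 2])
```

Here the lookahead picks token 2 (its continuation scores better) even though 3 was listed first, and the original state is left intact, which is the "roll back" part.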


The probabilities (in the form of log odds) can be directly accessed in the OpenAI playground, I believe. The "try again" approach would only work for temperature > 0, when the model actually samples tokens according to their probabilities. For temperature = 0 it always returns the token with the highest probability. Usually they use something like temperature 0.8 in ChatGPT, I think, which still biases the model toward the more likely tokens. In the playground the temperature can be set manually. (Again, for fine-tuned models, which are the majority, those numbers are not probabilities but "goodnesses".)
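Temperature works by dividing the logits by the temperature before the softmax, so lowering it sharpens the distribution toward the likelier tokens. A quick sketch (the logit values are made up):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw logits into a probability distribution.
    Lower temperature sharpens the distribution toward the
    highest-logit token; temperature -> 0 approaches greedy
    (always-argmax) decoding."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
p_hot = softmax_with_temperature(logits, 1.0)   # close to the raw distribution
p_cool = softmax_with_temperature(logits, 0.8)  # favors the top token more
```

At 0.8 the top token gets a larger share of the probability mass than at 1.0, which is the bias toward likelier tokens described above.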


Okay why is this downvoted? wtf


Upvoting a bit. My guess is we have anti-AI vigilantes here. Actually, it's not a guess anymore, and it's not something new in general.


You can literally fire up the OpenAI playground and ask GPT-3 to give you all the alternate token probabilities.




