Questions about the model, from a beginner.
Is there a way to set the length of the generated text?
Occasionally I get an answer that seems to stop halfway through.
I ask without knowing how this particular model works. I have seen that other models expose parameters for response length; here the only thing that occurs to me is to change the number of tokens. These are the parameters I am using:
temperature=0.2,
top_k=50,
top_p=0.9,
repetition_penalty=1.0,
max_new_tokens=512,  # adjust as needed
seed=42,
reset=True, # reset history (cache)
stream=True, # streaming per word/token
threads=int(os.cpu_count() / 6), # adjust for your CPU
stop=["<|endoftext|>"],
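From the parameter names, this looks like the ctransformers API. Below is a minimal sketch of a full generation call, assuming ctransformers and a local GGML model file; the model path and `model_type` are placeholder assumptions, not taken from the post. `max_new_tokens` is the cap on response length, so raising it is the right knob when answers stop halfway.

```python
import os

MODEL_PATH = "models/model.bin"  # hypothetical path; point this at your model file

# The settings above, collected into one dict.
gen_kwargs = dict(
    temperature=0.2,
    top_k=50,
    top_p=0.9,
    repetition_penalty=1.0,
    max_new_tokens=1024,  # doubled from 512 to allow longer answers
    seed=42,              # note: the parameter is "seed", not "seeds"
    reset=True,           # reset history (cache)
    stream=True,          # stream token by token
    threads=max(1, os.cpu_count() // 6),  # adjust for your CPU
    stop=["<|endoftext|>"],
)

if os.path.exists(MODEL_PATH):
    from ctransformers import AutoModelForCausalLM
    # model_type is an assumption here; set it to match your model family.
    llm = AutoModelForCausalLM.from_pretrained(MODEL_PATH, model_type="replit")
    for token in llm("Write a function that reverses a string.", **gen_kwargs):
        print(token, end="", flush=True)
```

With `stream=True` the call returns a generator, so the loop prints tokens as they arrive instead of waiting for the full answer.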
Assuming I wanted to take full advantage of the hardware: it uses so few resources that I can't tell whether it is running on the GPU or the CPU (I love that), although I'm curious what the limit is for what it can generate.
I use an RTX 2060 with 12 GB of VRAM, 32 GB of RAM, and a Ryzen 3600X.
Is there a way to use the GPU if it is not being used?
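If the backend is ctransformers, recent versions accept a `gpu_layers` argument in `from_pretrained` to offload part of the model to the GPU; it requires a build compiled with CUDA support, and the default of 0 keeps everything on the CPU, which would explain the low GPU usage. A sketch, with a hypothetical path and a guessed layer count:

```python
import os

MODEL_PATH = "models/model.bin"  # hypothetical path
GPU_LAYERS = 24  # layers to offload to the 12 GB RTX 2060; lower this if VRAM runs out

if os.path.exists(MODEL_PATH):
    from ctransformers import AutoModelForCausalLM
    llm = AutoModelForCausalLM.from_pretrained(
        MODEL_PATH,
        model_type="replit",   # assumption: set this to match your model
        gpu_layers=GPU_LAYERS, # 0 (the default) means pure CPU inference
    )
```

You can confirm the offload is happening by watching VRAM usage in `nvidia-smi` while generating.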
Is there a way to save the prompt and the response to a log file, such as query0001.txt?
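There is no built-in log like that as far as I know, but a small stdlib-only helper can produce exactly those numbered files; `save_query` below is my own name, not part of any library.

```python
from pathlib import Path

def save_query(prompt: str, response: str, log_dir: str = "logs") -> Path:
    """Write prompt and response to the next free queryNNNN.txt file."""
    d = Path(log_dir)
    d.mkdir(parents=True, exist_ok=True)
    # Number the new file after the logs already present.
    n = sum(1 for _ in d.glob("query*.txt")) + 1
    path = d / f"query{n:04d}.txt"
    path.write_text(f"PROMPT:\n{prompt}\n\nRESPONSE:\n{response}\n", encoding="utf-8")
    return path
```

Call it once per generation, after the streamed response has been collected into a string, and each run lands in query0001.txt, query0002.txt, and so on.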
Is there a way to paste already-written code into the input?
I tried copying something in to compare results with things I ask SAGE, for example,
but the paste was split across separate lines, so the model answered each line on its own and the result lacked coherence.
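That splitting is typical of a line-based prompt loop, which sends each pasted line as its own query. One workaround is to collect lines until a sentinel word and submit the whole block as a single prompt; `read_block` below is my own helper name, not part of any library.

```python
def read_block(lines, sentinel="EOF"):
    """Collect lines up to (but not including) the sentinel into one string."""
    block = []
    for line in lines:
        if line.strip() == sentinel:
            break
        block.append(line)
    return "\n".join(block)
```

In an interactive loop you could call `read_block(line.rstrip("\n") for line in sys.stdin)` and type EOF on its own line when you are done pasting, so the model sees the whole snippet at once.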
Thank you very much in advance if you can answer my questions.
I also saw a video that mentioned a 10 GB model:
https://huggingface.co/replit/replit-code-v1-3b/tree/main
Is it possible to use it? Is it better, the same, or worse? Will it work if I download it?