Rietenpetardat: Difference between revisions

| prefix = Mention
| owner = [[User:Edward|Edward]]
| language = Python
| sourcecode = https://github.com/Edward205/markov-discord-bot (old version)
| invitelink = Not available to public


== Commands ==
The following refers to the latest version of Rietenpetardat as of 2 October 2023.


* '''/intreaba''' <question> - Query a language model instructed to answer questions
* '''/gptconfig''' - Prints the current configuration of the GPT model
* '''/identitate''' <identity> - Sets the identity to the given parameter (only for moderators)
* '''/parametru''' - Sets a parameter from the list below (only for moderators)
** '''modgen''' - Selects the way in which a message is generated
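Since the project started out as a Markov-chain bot (the linked repository is named markov-discord-bot), the simplest generation mode can be sketched as follows. This is a minimal illustration, not the bot's actual implementation; the chain order and sampling details are assumptions.

```python
# Minimal first-order Markov text generator: map each word to the words
# observed to follow it in the training corpus, then walk the chain.
import random

def build_chain(corpus: str) -> dict:
    """Map each word to the list of words seen immediately after it."""
    chain = {}
    words = corpus.split()
    for current, following in zip(words, words[1:]):
        chain.setdefault(current, []).append(following)
    return chain

def generate(chain: dict, start: str, max_words: int = 20, seed=None) -> str:
    """Walk the chain from a start word, sampling each next word at random."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < max_words and out[-1] in chain:
        out.append(rng.choice(chain[out[-1]]))
    return " ".join(out)
```

Repeated word pairs in the corpus make their transitions proportionally more likely, which is what lets the output loosely imitate the style of the source chat.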
Chatbots are complex programs, requiring a high degree of analysis and processing of text data. One approach that would eliminate the dependence on an external service while still generating coherent conversations would be to train an LLM on just the OKPR chat data, either from scratch or by fine-tuning.
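Whichever route is taken, the archived chat would first have to be reshaped into training examples. A hypothetical sketch of that preparation step, pairing each message with the messages preceding it as context — the field names and archive format are assumptions, since the real archives are not public:

```python
# Turn an ordered list of chat messages into (prompt, completion) pairs
# suitable for fine-tuning: each message becomes a completion, with the
# previous `context_len` messages joined as its prompt.
def to_pairs(messages: list, context_len: int = 3) -> list:
    pairs = []
    for i in range(1, len(messages)):
        context = "\n".join(
            m["content"] for m in messages[max(0, i - context_len):i]
        )
        pairs.append({"prompt": context, "completion": messages[i]["content"]})
    return pairs
```

The resulting pairs could feed either a from-scratch training loop or a fine-tuning run on a pretrained model.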


All generation-related code until The Return, with the exception of GPT-3 and ChatGPT, was written in C++ to be as fast as possible and linked to [https://discord.js.org/ discord.js] via the stdin and stdout interface.
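The stdin/stdout linking pattern described above can be sketched as follows — here in Python (the bot's current language) rather than discord.js, with a trivial echo script standing in for the C++ generator binary, which is not public. The protocol shown (one prompt line in, one generated line out) is an assumption for illustration.

```python
# Sketch of driving a generator child process over stdin/stdout: the bot
# writes a prompt line to the child's stdin and reads one reply line back.
import subprocess
import sys

# Hypothetical stand-in for the C++ generator: echoes each input line back
# with a "generated: " prefix, flushing so the parent never blocks.
GENERATOR = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print('generated: ' + line.strip())\n"
    "    sys.stdout.flush()",
]

def ask_generator(proc: subprocess.Popen, prompt: str) -> str:
    """Send one prompt over stdin, read one reply line from stdout."""
    proc.stdin.write(prompt + "\n")
    proc.stdin.flush()
    return proc.stdout.readline().strip()

proc = subprocess.Popen(GENERATOR, stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE, text=True)
reply = ask_generator(proc, "salut")
proc.stdin.close()  # EOF lets the child's read loop finish
proc.wait()
```

Keeping the generator as a separate process like this means the heavy text-processing code can be written in any fast language while the bot frontend only deals with lines of text.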


The training data for the algorithms or LLMs is an archive of text messages from the #bucuresti-general channel on OKPR v2, taken one day before it was deleted (see [[OkPrietenRetardat/Timeline|Timeline]]). Additionally, another archive of the #bucuresti-general channel from OKPR v3 was taken, which will also be used. These archives are not publicly available.