Editing Rietenpetardat

{{Infobox bot
| name = rietenpetardat
AKA Ortobebop
| image = Rietenpetardat bot picture.webp
| discordtag = Ortobebop#4964
| discordid = 1013881714243805254
| prefix = Mention
| owner = [[User:Edward|Edward]]
| language = Python
| sourcecode = https://github.com/Edward205/markov-discord-bot (old version)
| invitelink = Not available to public
}}


'''rietenpetardat''' (also known as Ortobebop) is a [[wikipedia:Chatbot|chatbot]] created by [[User:Edward|Edward]] for the [[OkPrietenRetardat|OKPR]] Discord server. Its purpose is to talk to server users in a way similar to a regular member of the community. Over time, it has received updates such as new commands and changes to the core architecture. It is intended for the OKPR community only; as such, it may not be added to other servers.


Its profile picture was generated with Stable Diffusion by Razvan5576.
== Commands ==
The following describes the latest version of rietenpetardat as of 2nd of October 2023.
 
* '''/intreaba''' <question> - Query a language model instructed to answer questions
* '''/gptconfig''' - Prints the current configuration of the GPT model
* '''/identitate''' <identity> - Sets the identity to the given parameter (only for moderators)
* '''/parametru''' - Sets a parameter from the list below (only for moderators)
** '''modgen''' - Selects how a message is generated
*** '''Organic''' - Text is generated until a response from the identity is generated
*** '''Forțat''' (Forced) - A response belonging to the identity is appended on the next line, forcing the AI to generate a message from that identity
** '''temperatura'''
*** A floating point number indicating the temperature (randomness) of the generation
** '''top_k'''
*** The number of most probable tokens, ranked by probability, from which the next token may be chosen
** '''seed'''
*** An integer used to seed the random number generator for the generation
* '''/reinitializare''' - Reinitialises the model. Necessary when changing some options (only for moderators)
* '''/resetmemorie''' - Clears the memory of past messages (only for moderators)
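
The interaction between '''temperatura''', '''top_k''' and '''seed''' can be illustrated with a minimal sketch. This is not the bot's code: the function name <code>sample_token</code> and the example scores are hypothetical, and it only shows the usual meaning of these three settings in GPT-style sampling.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_k=None, rng=None):
    """Pick one token index from raw model scores (logits)."""
    if rng is None:
        rng = random.Random()
    # Rank candidate tokens from the highest score to the lowest.
    candidates = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    # top_k: keep only the k best-scoring candidates, if requested.
    if top_k is not None:
        candidates = candidates[:top_k]
    # temperature: values above 1 flatten the distribution (more random),
    # values below 1 sharpen it (more predictable). Must be greater than 0.
    scaled = [logits[i] / temperature for i in candidates]
    # Softmax weights, shifted by the maximum for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(candidates, weights=weights, k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
rng = random.Random(1234)       # seed: the same seed gives the same choices
picks = [sample_token(logits, temperature=0.8, top_k=2, rng=rng) for _ in range(5)]
print(picks)
```

With <code>top_k=2</code> only the two highest-scoring tokens can ever be picked, and rerunning with the same seed reproduces the same sequence of picks.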
 
== Development history ==


=== Initial version ===
As required by the GPL3 license to which the original IRC chatbot code is subject, the modified source code for rietenpetardat can be found here: https://github.com/Edward205/markov-discord-bot


The first tests of this version began on 29th of August 2022, and it was released on the OKPR Discord server the next day.


=== Addition of the copypasta command ===


=== Temporary shutdown ===
On 24th of April 2023, the GPT-3 chat functions and the ChatGPT command stopped working because Edward's payment method started rejecting payments from OpenAI, making the API inaccessible.
 
=== The return ===
Since the shutdown, Edward has been searching for a way to train an AI that can be fully self-hosted. An attempt to train a language model using [https://marian-nmt.github.io/ marian-nmt] was not successful. On 18th of July he began experimenting with [https://github.com/karpathy/nanoGPT nanoGPT] and was able to train a model from scratch on data from past chats on the OKPR Discord server. The next day, a bare-bones but operational version was finished and released on the OKPR Discord server for a few hours of testing.


The Discord bot is now developed in Python using the [https://pycord.dev/ Pycord] library. Development testing of the bot is done on a private server.  
=== Future plans ===
Edward is planning on training a custom model as described below. An attempt has been made to train a language model using [https://marian-nmt.github.io/ marian-nmt], but this has not been successful. He is now planning to train a GPT-style model.


== Technical details and research ==
Chatbots are complex programs, requiring a high degree of analysis and processing of text data. One approach that would eliminate our dependence on an external service and generate coherent conversations would be to train an LLM with just the OKPR chat data. This could be trained from scratch or by fine-tuning.  


Until The Return, all generation-related code, with the exception of GPT-3 and ChatGPT, was written in C++ for speed and connected to [https://discord.js.org/ discord.js] through standard input and output (stdin and stdout).
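
The stdin and stdout pattern can be sketched as follows. The original pairing was a C++ generator driven from discord.js; this illustration uses Python on both sides, and the child program, the one-line-per-message protocol and the helper names <code>start_generator</code> and <code>ask</code> are all assumptions made for the example.

```python
import subprocess
import sys

# Stand-in for the generator program: reads one prompt per line on stdin and
# answers with one line on stdout. This echo-style child is purely
# illustrative and is not the bot's actual generator.
CHILD_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print('reply to: ' + line.strip(), flush=True)\n"
)


def start_generator():
    """Start the child process with pipes attached to its stdin and stdout."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered pipes on the parent side
    )


def ask(proc, prompt):
    """Send one line to the child and block until it answers with one line."""
    proc.stdin.write(prompt + "\n")
    proc.stdin.flush()
    return proc.stdout.readline().rstrip("\n")


proc = start_generator()
reply = ask(proc, "salut")
print(reply)
proc.stdin.close()  # end of input lets the child leave its read loop
proc.wait()
```

The generator stays running between messages, so any model or data it has loaded is reused, and the two programs only need to agree on the line format.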


The training data for the algorithms or LLMs is an archive of text messages from the #bucuresti-general channel on OKPR v2, taken one day before it was deleted (see [[OkPrietenRetardat/Timeline|Timeline]]). A second archive, of the #bucuresti-general channel from OKPR v3, was also taken and will be used as well. These archives are not publicly available.
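
As a rough illustration of how a chat archive could be turned into training data for a small GPT-style model, the sketch below builds a character-level vocabulary and a train/validation split. The page does not say how the real archives are formatted or tokenised, so the sample text, the <code>name: message</code> layout and the character-level encoding are assumptions.

```python
# Made-up sample standing in for the private chat archive.
raw_text = "ana: salut\nbob: ce faci?\nana: bine\n"

# Character-level vocabulary: every distinct character gets an integer id.
chars = sorted(set(raw_text))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}


def encode(text):
    """Turn a string into a list of integer token ids."""
    return [stoi[ch] for ch in text]


def decode(ids):
    """Turn a list of integer token ids back into a string."""
    return "".join(itos[i] for i in ids)


# Hold out the last 10% of the token stream for validation.
ids = encode(raw_text)
split = int(len(ids) * 0.9)
train_ids, val_ids = ids[:split], ids[split:]

print(len(chars), len(train_ids), len(val_ids))
```

Encoding followed by decoding returns the original text unchanged, which is a quick check that the vocabulary covers the whole archive.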