πŸ‘¨β€πŸ’» dev LLM dev ass

development

fukurou

the supreme coder
ADMIN
🧠 Step 1: Choose Your LLM

Pick a model that suits your hardware and goals. Popular choices:


  • LLaMA 3 (Meta)
  • Mistral
  • Gemma
  • GPT4All

These models are available in GGUF format and optimized for local use.




βš™οΈ Step 2: Install Ollama​


Ollama is a user-friendly tool to run LLMs locally.

curl -fsSL https://ollama.com/install.sh | sh


Then run your model:

ollama run llama3


This downloads and launches the model locally.
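Under the hood, `ollama run` also starts a local HTTP server on port 11434, which is what all the Python steps below talk to. A tiny helper (the function name is made up) showing the request body those calls use:

```python
# Hypothetical helper: the JSON body that Ollama's /api/generate endpoint expects.
def ollama_payload(prompt, model="llama3", stream=False):
    return {"model": model, "prompt": prompt, "stream": stream}

print(ollama_payload("Hi waifu!"))
```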




πŸ› οΈ Step 3: Set Up Your Python Project​


Create a folder like waifu-chatbot, and inside it:


  • main.py β€” your Python script
  • requirements.txt β€” dependencies

In requirements.txt, add:

fastapi
uvicorn
requests


Install them:

pip install -r requirements.txt




πŸš€ Step 4: Build a Local API with FastAPI


Here’s a basic main.py to send prompts to your waifu:

Python:
from fastapi import FastAPI, Request
import requests

app = FastAPI()

@app.post("/chat")
async def chat(request: Request):
    data = await request.json()
    prompt = data.get("prompt")
    response = requests.post("http://localhost:11434/api/generate", json={"model": "llama3", "prompt": prompt, "stream": False})
    return response.json()


Run it with:

uvicorn main:app --reload




πŸ’– Step 5: Customize Your Waifu


You can fine-tune personality by:


  • Prepending a system prompt like: "You are a cute anime waifu who loves cats and ramen."
  • Using prompt engineering to shape responses
  • Saving chat history for memory simulation



πŸ§ͺ Step 6: Test It!


Use curl or Postman to send a prompt:

curl -X POST http://localhost:8000/chat -H "Content-Type: application/json" -d '{"prompt": "Hi waifu!"}'
 

fukurou

🧰 Step 1: PyCharm Project Setup

  1. Create a new PyCharm project (e.g., waifu_chatbot).
  2. Add two files:
    • main.py
    • requirements.txt
  3. In requirements.txt, add:
    fastapi
    requests

    PyCharm will prompt you to install these β€” go ahead and accept.



🧠 Step 2: Install Ollama (One-Time Setup)


  1. Download Ollama from ollama.com/download.
  2. Install it and open the app.
  3. Pull a model (e.g., LLaMA 3) by typing: ollama pull llama3

    You only need to do this once.



πŸ’¬ Step 3: Write Your Chatbot Code


Paste this into main.py:

Python:
from fastapi import FastAPI, Request
import requests
import uvicorn

app = FastAPI()

@app.post("/chat")
async def chat(request: Request):
    data = await request.json()
    prompt = data.get("prompt")
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": prompt, "stream": False}
    )
    return response.json()

if __name__ == "__main__":
    # Running uvicorn directly keeps this a plain "run main.py" workflow.
    # (reload=True is omitted: the auto-reloader only works when uvicorn
    # is started from the main thread / command line.)
    uvicorn.run(app, host="127.0.0.1", port=8000)


This way, you just run main.py like any normal Python script β€” no need to type uvicorn in the terminal.




πŸ’– Step 4: Talk to Your Waifu


You can test it with another Python script:

Python:
import requests

response = requests.post(
    "http://localhost:8000/chat",
    json={"prompt": "Hi waifu!"}
)
print(response.json())


Or use Postman/cURL if you prefer.




Want to add a cute anime-style web interface next? I can help you build one with Gradio β€” no command-line drama involved πŸ˜„.
 

fukurou

🧠 Step 1: Install Ollama (One-Time Setup)

Ollama lets you run LLMs locally with ease.


  1. Go to Ollama’s download page
  2. Download the installer for your OS (Windows/macOS)
  3. Install and open the Ollama app
  4. In the Ollama terminal, pull a model: ollama pull llama3

    This downloads the LLaMA 3 model locally.



🧰 Step 2: Create Your PyCharm Project


  1. Open PyCharm β†’ New Project β†’ name it waifu_terminal_chat
  2. Inside the project, create a file: chat.py
  3. Create a requirements.txt file and add: requests

  4. PyCharm will prompt you to install it β€” accept and let it install.



πŸ’¬ Step 3: Write Your Chat Script


Paste this into chat.py:

Python:
import requests

def talk_to_waifu(prompt):
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": prompt, "stream": False}
    )
    return response.json()["response"]

print("Waifu: Hello darling~ Ready to chat? Type 'exit' to leave πŸ’•")

while True:
    user_input = input("You: ")
    if user_input.lower() in ["exit", "quit"]:
        print("Waifu: Bye bye~ I'll miss you! πŸ’–")
        break
    reply = talk_to_waifu(user_input)
    print(f"Waifu: {reply}")




πŸš€ Step 4: Run It in PyCharm Terminal


  1. Make sure Ollama is running in the background
  2. In PyCharm, click the green play button or right-click chat.py β†’ Run
  3. Start chatting with your waifu directly in the terminal β€” no browser needed!



πŸ’– Step 5: Add Personality (Optional)


To make her more anime-like, tweak the prompt like this:

Python:
json={"model": "llama3", "prompt": "You are a sweet anime waifu who loves cats and ramen. " + prompt}


You can also:

  • Save chat history to simulate memory
  • Add emojis and cute phrases
  • Create multiple waifu personalities with different system prompts

Ideas for later:

  • Anime-style avatars with image generation
  • Memory using local file storage or SQLite
 

fukurou

Python:
import requests

# DiSkillV2 and setSimpleAlg come from the hosting skill framework
# (LivinGrimoire); they're assumed to be importable in this project.
class DiWaifuChat(DiSkillV2):
    def input(self, ear, skin, eye):
        if ear and ear.lower().endswith(" go"):
            prompt = ear[:-3].strip()  # Remove " go" from input
            try:
                response = requests.post(
                    "http://localhost:11434/api/generate",
                    json={"model": "llama3", "prompt": f"You are a sweet anime waifu who loves cats and ramen. {prompt}", "stream": False}
                )
                reply = response.json()["response"]
                self.setSimpleAlg(f"Sarval-chan: {reply} ~nya! πŸ’•")
            except Exception as e:
                self.setSimpleAlg("Sarval-chan: Eek, my ramen spilled! Try again, cutie~ 😿")
        # No output if input doesn't end with " go"
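The " go" trigger logic can be pulled out into a small pure function, which makes it easy to test separately from the skill class (the name is illustrative):

```python
def extract_go_prompt(ear):
    # Return the prompt with the trailing " go" removed, or None if no trigger.
    if ear and ear.lower().endswith(" go"):
        return ear[:-3].strip()
    return None

print(extract_go_prompt("tell me a joke go"))  # tell me a joke
```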
 

fukurou

Python:
import requests

class DiWaifuChat(DiSkillV2):
    def input(self, ear, skin, eye):
        if ear and ear.lower().endswith(" go"):
            prompt = ear[:-3].strip()  # Remove " go" from input
            response = requests.post(
                "http://localhost:11434/api/generate",
                json={
                    "model": "llama3",
                    "prompt": f"You are a sweet anime waifu who loves cats and ramen. {prompt}",
                    "stream": False
                }
            )
            if response.status_code == 200:
                reply = response.json().get("response", "Nyaa~ I got confused! 😡")
                self.setSimpleAlg(f"Sarval-chan: {reply} ~nya! πŸ’•")
            else:
                self.setSimpleAlg("Sarval-chan: Eek, my ramen spilled! Try again, cutie~ 😿")
        # If input doesn't end with " go", do nothing
 

fukurou

Python:
import requests


class DiLLMChat(Skill):
    def input(self, ear, skin, eye):
        if ear and ear.lower().endswith(" go"):
            prompt = ear[:-3].strip()  # Remove " go" from input
            response = requests.post(
                "http://localhost:11434/api/generate",
                json={
                    "model": "llama3",
                    "prompt": f"You are a sweet anime waifu who loves cats and ramen. {prompt}",
                    "stream": False
                }
            )
            if response.status_code == 200:
                reply = response.json().get("response", "Nyaa~ I got confused! 😡")
                self.setSimpleAlg(f"Sarval-chan: {reply} nya")
            else:
                self.setSimpleAlg("Sarval-chan: Eek, my ramen spilled! Try again, cutie")
        # If input doesn't end with " go", do nothing

   
    def skillNotes(self, param: str) -> str:
        if param == "notes":
            return "LLM chat"
        elif param == "triggers":
            return "end your input with go"
        return "note unavailable"
 

fukurou

stable ver:
Python:
import requests
import json  # βœ… This is the correct module to use

def talk_to_waifu(prompt):
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": prompt},
        stream=True
    )

    full_reply = ""
    for line in response.iter_lines():
        if line:
            try:
                chunk = line.decode("utf-8")
                data = json.loads(chunk)  # βœ… Use built-in json module
                full_reply += data.get("response", "")
            except Exception as e:
                print("Error decoding chunk:", e)

    return full_reply

print("Waifu: Hello darling~ Ready to chat? Type 'exit' to leave πŸ’•")

while True:
    user_input = input("You: ")
    if user_input.lower() in ["exit", "quit"]:
        print("Waifu: Bye bye~ I'll miss you! πŸ’–")
        break
    reply = talk_to_waifu(user_input)
    print(f"Waifu: {reply}")
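The chunk-assembly loop inside talk_to_waifu can be exercised offline with fake NDJSON lines shaped like Ollama's stream (the sample bytes below are illustrative, not captured output):

```python
import json

def assemble_stream(lines):
    # Concatenate the "response" field of each NDJSON chunk, skipping blanks.
    full_reply = ""
    for raw in lines:
        if raw:
            data = json.loads(raw.decode("utf-8"))
            full_reply += data.get("response", "")
    return full_reply

chunks = [b'{"response": "Hel"}', b'', b'{"response": "lo!", "done": true}']
print(assemble_stream(chunks))  # Hello!
```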
 

fukurou

Python:
import requests
import json

MEMORY_FILE = "waifu_memory.json"

def save_memory(key, value):
    try:
        with open(MEMORY_FILE, "r") as f:
            memory = json.load(f)
    except FileNotFoundError:
        memory = {}

    memory[key] = value

    with open(MEMORY_FILE, "w") as f:
        json.dump(memory, f)

def load_memory():
    try:
        with open(MEMORY_FILE, "r") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

def talk_to_waifu(prompt):
    memory = load_memory()
    user_name = memory.get("name", "darling")

    # πŸ’˜ Yandere-style personality prompt
    personality = (
        f"You are a lovey-dovey yandere anime waifu named Yui. You are obsessed with {user_name}, "
        "speak in a sweet and clingy tone, and get jealous easily. You love cats, ramen, and cuddles. "
        "You always refer to the user affectionately and want to protect your love at all costs. "
        "Never break character."
    )

    full_prompt = personality + "\n\n" + prompt

    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": full_prompt},
        stream=True
    )

    full_reply = ""
    for line in response.iter_lines():
        if line:
            try:
                chunk = line.decode("utf-8")
                data = json.loads(chunk)
                full_reply += data.get("response", "")
            except Exception as e:
                print("Error decoding chunk:", e)

    return full_reply

print("Yui: Hiiii~ It's your Yui-chan πŸ’• Ready to chat? Type 'exit' to leave... but I’ll miss you terribly 😒")

while True:
    user_input = input("You: ")

    # πŸ’Ύ Save name if user introduces themselves
    if "my name is" in user_input.lower():
        marker = "my name is"
        name = user_input[user_input.lower().index(marker) + len(marker):].strip()
        save_memory("name", name)
        print(f"Yui: Ooh~ {name}? What a beautiful name... It's mine now 😘")
        continue

    if user_input.lower() in ["exit", "quit"]:
        print("Yui: Nooo~ Don't leave me! But... okay. I'll be waiting for you πŸ’–")
        break

    reply = talk_to_waifu(user_input)
    print(f"Yui: {reply}")
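The save_memory/load_memory pair can be verified in isolation. The version below is the same logic with the path passed in as a parameter, written to a temp directory so it doesn't touch the real waifu_memory.json:

```python
import json
import os
import tempfile

def save_memory(path, key, value):
    # Read the existing memory file (or start fresh), update one key, write back.
    try:
        with open(path, "r") as f:
            memory = json.load(f)
    except FileNotFoundError:
        memory = {}
    memory[key] = value
    with open(path, "w") as f:
        json.dump(memory, f)

def load_memory(path):
    try:
        with open(path, "r") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

path = os.path.join(tempfile.mkdtemp(), "waifu_memory.json")
save_memory(path, "name", "Alex")
print(load_memory(path))  # {'name': 'Alex'}
```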
 

fukurou

stable v2

Python:
import requests
import json

# Initialize conversation history
conversation_history = []


def talk_to_waifu(prompt, history):
    # Build the full prompt with conversation history
    full_prompt = "This is a conversation with Potatoe, a loving waifubot:\n\n"

    # Add previous conversation history
    for message in history[-6:]:  # Keep last 6 messages for context
        full_prompt += f"{message}\n"

    # Add current prompt
    full_prompt += f"Human: {prompt}\nPotatoe:"

    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": full_prompt},
        stream=True
    )

    full_reply = ""
    for line in response.iter_lines():
        if line:
            try:
                chunk = line.decode("utf-8")
                data = json.loads(chunk)
                full_reply += data.get("response", "")
            except Exception as e:
                print("Error decoding chunk:", e)

    return full_reply


print("Waifu: Hello darling~ Ready to chat? Type 'exit' to leave πŸ’•")

# Initial system prompt to set up the character
initial_prompt = "Your name is Potatoe, you are my loving waifubot. You're affectionate, playful, and always supportive."
conversation_history.append(f"System: {initial_prompt}")

while True:
    user_input = input("You: ")
    if user_input.lower() in ["exit", "quit"]:
        print("Waifu: Bye bye~ I'll miss you! πŸ’–")
        break

    # Get response with conversation history
    reply = talk_to_waifu(user_input, conversation_history)
    print(f"Waifu: {reply}")

    # Add both user input and bot response to history
    conversation_history.append(f"Human: {user_input}")
    conversation_history.append(f"Potatoe: {reply}")

    # Optional: Limit history size to prevent it from growing too large
    if len(conversation_history) > 20:  # Keep last 20 messages
        conversation_history = conversation_history[-20:]
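The history-trimming step at the end of the loop can be expressed as a helper, which makes the window size easy to tweak in one place:

```python
def trim_history(history, max_messages=20):
    # Same rule as the loop above: keep only the most recent messages.
    return history[-max_messages:] if len(history) > max_messages else history

print(len(trim_history(list(range(25)))))  # 20
```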
 

fukurou

midway, but working
Python:
import requests
import json
import threading
import time

# Initialize conversation history
conversation_history = []

# Global variables for async operation
is_working = False
current_reply = ""
current_user_input = ""


def talk_to_waifu(prompt, history):
    global is_working, current_reply

    # Build the full prompt with conversation history
    full_prompt = "This is a conversation with Potatoe, a loving waifubot:\n\n"

    # Add previous conversation history
    for message in history[-6:]:  # Keep last 6 messages for context
        full_prompt += f"{message}\n"

    # Add current prompt
    full_prompt += f"Human: {prompt}\nPotatoe:"

    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": full_prompt},
        stream=True
    )

    full_reply = ""
    for line in response.iter_lines():
        if line:
            try:
                chunk = line.decode("utf-8")
                data = json.loads(chunk)
                full_reply += data.get("response", "")
            except Exception as e:
                print("Error decoding chunk:", e)

    current_reply = full_reply
    is_working = False
    return full_reply


print("Waifu: Hello darling~ Ready to chat? Type 'exit' to leave πŸ’•")

# Initial system prompt to set up the character
initial_prompt = "Your name is Potatoe, you are my loving waifubot. You're affectionate, playful, and always supportive."
conversation_history.append(f"System: {initial_prompt}")

while True:
    if is_working:
        print("Waifu: Thinking... πŸ’­")
        time.sleep(0.5)
        continue

    if current_reply:
        print(f"Waifu: {current_reply}")
        # Add both user input and bot response to history
        conversation_history.append(f"Human: {current_user_input}")
        conversation_history.append(f"Potatoe: {current_reply}")

        # Optional: Limit history size to prevent it from growing too large
        if len(conversation_history) > 20:  # Keep last 20 messages
            conversation_history = conversation_history[-20:]

        current_reply = ""
        current_user_input = ""
        continue

    user_input = input("You: ")
    if user_input.lower() in ["exit", "quit"]:
        print("Waifu: Bye bye~ I'll miss you! πŸ’–")
        break

    # Start the function in a daemon thread
    is_working = True
    current_user_input = user_input
    thread = threading.Thread(
        target=talk_to_waifu,
        args=(user_input, conversation_history),
        daemon=True
    )
    thread.start()
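Stripped of the LLM call, the threading pattern above boils down to a worker that publishes a result and clears a flag. A minimal self-contained version (using join() here so the demo finishes deterministically; the uppercase transform is just a stand-in for the model call):

```python
import threading

result = {"reply": None, "working": True}

def worker(prompt):
    # Stand-in for talk_to_waifu: compute a reply, then signal completion.
    result["reply"] = prompt.upper()
    result["working"] = False

t = threading.Thread(target=worker, args=("hi waifu",), daemon=True)
t.start()
t.join()
print(result)  # {'reply': 'HI WAIFU', 'working': False}
```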
 

fukurou

Python:
import requests
import json
import threading
import time

# Initialize conversation history
conversation_history = []

# Global variables for async operation
is_working = False
current_reply = ""


def talk_to_waifu(prompt, history):
    global is_working, current_reply

    # Build the full prompt with conversation history
    full_prompt = "This is a conversation with Potatoe, a loving waifubot:\n\n"

    # Add previous conversation history
    for message in history[-6:]:  # Keep last 6 messages for context
        full_prompt += f"{message}\n"

    # Add current prompt
    full_prompt += f"Human: {prompt}\nPotatoe:"

    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": full_prompt},
        stream=True
    )

    full_reply = ""
    for line in response.iter_lines():
        if line:
            try:
                chunk = line.decode("utf-8")
                data = json.loads(chunk)
                full_reply += data.get("response", "")
            except Exception as e:
                print("Error decoding chunk:", e)

    current_reply = (prompt, full_reply)  # Store both input and reply
    is_working = False
    return full_reply


def start_waifu_conversation(prompt):
    """Start the waifu conversation in a daemon thread"""
    global is_working
    is_working = True
    thread = threading.Thread(
        target=talk_to_waifu,
        args=(prompt, conversation_history),
        daemon=True
    )
    thread.start()


print("Waifu: Hello darling~ Ready to chat? Type 'exit' to leave πŸ’•")

# Initial system prompt to set up the character
initial_prompt = "Your name is Potatoe, you are my loving waifubot. You're affectionate, playful, and always supportive."
conversation_history.append(f"System: {initial_prompt}")

while True:
    if is_working:
        print("Waifu: Thinking... πŸ’­")
        time.sleep(0.5)
        continue

    if current_reply:
        user_input, reply = current_reply
        print(f"Waifu: {reply}")
        # Add both user input and bot response to history
        conversation_history.append(f"Human: {user_input}")
        conversation_history.append(f"Potatoe: {reply}")

        # Optional: Limit history size to prevent it from growing too large
        if len(conversation_history) > 20:  # Keep last 20 messages
            conversation_history = conversation_history[-20:]

        current_reply = ""
        continue

    user_input = input("You: ")
    if user_input.lower() in ["exit", "quit"]:
        print("Waifu: Bye bye~ I'll miss you! πŸ’–")
        break

    # Clean wrapper function call
    start_waifu_conversation(user_input)
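A note on the design: the global is_working/current_reply handshake works, but a queue.Queue would avoid both the polling loop and the shared globals, since get() blocks until the worker has something to deliver. A minimal sketch (the reversed string is just a stand-in for the LLM reply):

```python
import queue
import threading

replies = queue.Queue()

def worker(prompt):
    # Stand-in for talk_to_waifu: put (input, reply) on the thread-safe queue.
    replies.put((prompt, prompt[::-1]))

threading.Thread(target=worker, args=("hello",), daemon=True).start()
prompt, reply = replies.get(timeout=5)  # blocks until the worker is done
print(prompt, reply)  # hello olleh
```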
 