Memorymesh
"I may not have gone where I intended to go, but I think I have ended up where I needed to be."
Exploring AI Model Formats: A Deep Dive into Llama.cpp, GGUF, GGML, and Huggingface Transformers
Awesome breakdown post. Some clarifications/findings from the above: GGML isn't necessarily quantized. I ran the GGML converter on my GPT-J float16 model. It remained float16, but it just ran a...
Posted by Aug on November 26, 2023
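The point above is that conversion to GGML and quantization are separate steps: a float16 model stays float16 unless you explicitly quantize it. Here is a minimal sketch (not llama.cpp's actual code) of what Q8_0-style block quantization does: each block of 32 floats is replaced by one scale factor plus 32 signed 8-bit integers.

```python
import numpy as np

def quantize_q8_0(x, block=32):
    # Split the tensor into blocks and store each block as
    # (float scale, int8 values) -- a rough Q8_0 analogue.
    x = x.reshape(-1, block)
    scale = np.abs(x).max(axis=1, keepdims=True) / 127.0
    scale[scale == 0] = 1.0  # avoid division by zero in all-zero blocks
    q = np.round(x / scale).astype(np.int8)
    return scale, q

def dequantize_q8_0(scale, q):
    # Reconstruct approximate floats from scales and int8 values.
    return (q.astype(np.float32) * scale).ravel()

w = np.linspace(-1, 1, 64, dtype=np.float32)
s, q = quantize_q8_0(w)
w2 = dequantize_q8_0(s, q)
assert np.max(np.abs(w - w2)) < 1e-2  # small round-trip error
```

Skipping this step is exactly why the converted GPT-J model stayed float16: the converter only repacks the tensors, and quantization is an optional pass applied afterwards.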
It's Hard to Find an Uncensored Model
Turns out most models are based on data from OpenAI somehow, and this data has guardrails. Found this post on how to finetune a base model after removing all refusals: Based post on making uncens...
Posted by Aug on November 26, 2023
Boosting GPT-J Performance: Converting to GGML for Rapid Inference
I’ve been trying to run inference on a model based on EleutherAI/gpt-j-6B from Huggingface, and it was super slow! The model took about 15 to 30 minutes to respond to my prompt (including model loa...
Posted by Aug on November 26, 2023
FEATURED TAGS
Huggingface Transformers
React
Next.js
Client Components
Server Components
Authentication
Context API
GGUF Format
Supabase DB
Web Development
ABOUT ME
Concurrency / Performance
Web3 / AI
Web2.0 Veteran
FRIENDS
Brandon
Ali G