Memorymesh
"I may not have gone where I intended to go, but I think I have ended up where I needed to be."
Exploring AI Model Formats: A Deep Dive into Llama.cpp, GGUF, GGML, and Huggingface Transformers
Awesome breakdown post. Some clarifications/findings from the above: GGML isn't necessarily quantized. I ran the GGML converter on my GPT-J float16 model. It remained float16, but it just ran a...
Posted by Aug on November 26, 2023
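The point above is that conversion to GGML and quantization are separate steps: a float16 model stays float16 unless you explicitly quantize it. Here is a minimal sketch (not llama.cpp's actual code) of what Q8_0-style block quantization does: each block of 32 floats is replaced by one scale factor plus 32 signed 8-bit integers.

```python
import numpy as np

def quantize_q8_0(x, block=32):
    # Split the tensor into blocks and store each block as
    # (float scale, int8 values) -- a rough Q8_0 analogue.
    x = x.reshape(-1, block)
    scale = np.abs(x).max(axis=1, keepdims=True) / 127.0
    scale[scale == 0] = 1.0  # avoid division by zero in all-zero blocks
    q = np.round(x / scale).astype(np.int8)
    return scale, q

def dequantize_q8_0(scale, q):
    # Reconstruct approximate floats from scales and int8 values.
    return (q.astype(np.float32) * scale).ravel()

w = np.linspace(-1, 1, 64, dtype=np.float32)
s, q = quantize_q8_0(w)
w2 = dequantize_q8_0(s, q)
assert np.max(np.abs(w - w2)) < 1e-2  # small round-trip error
```

Skipping this step is exactly why the converted GPT-J model stayed float16: the converter only repacks the tensors, and quantization is an optional pass applied afterwards.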
It's Hard to Find an Uncensored Model
Turns out most models are based on data from OpenAI somehow, and this data has guardrails. Found this post on how to finetune a base model after removing all refusals: Based post on making uncens...
Posted by Aug on November 26, 2023
Boosting GPT-J Performance: Converting to GGML for Rapid Inference
I’ve been trying to run inference on a model based on EleutherAI/gpt-j-6B from Huggingface, and it was super slow! The model took about 15 to 30 minutes to respond to my prompt (including model loa...
Posted by Aug on November 26, 2023
FEATURED TAGS
Huggingface Transformers
React
Next.js
Client Components
Server Components
Authentication
Context API
GGUF Format
Supabase DB
Web Development
ABOUT ME
Concurrency / Performance
Web3 / AI
Web2.0 Veteran
FRIENDS
Brandon
Ali G