📝 Blog Articles

Programming Intelligence: Using Symbolic Rules to Steer and Evolve AI

🧪 Summary

What if AI systems could learn how to improve themselves not just at the level of weights or prompts, but at the level of strategy itself? In this post, we show how to build such a system, powered by symbolic rules and reflection.

The paper Symbolic Agents: Symbolic Learning Enables Self-Evolving Agents introduces a framework where symbolic rules guide, evaluate, and evolve agent behavior.

Read more →

Adaptive Reasoning with ARM: Teaching AI the Right Way to Think

Summary

Chain-of-thought is powerful, but which chain? Short explanations work for easy tasks, long reflections help on hard ones, and code sometimes beats them both. What if your model could adaptively pick the best strategy, per task, and improve as it learns?

The Adaptive Reasoning Model (ARM) is a framework for teaching language models to choose the right reasoning format (direct answers, chain of thought, or code) depending on the task. It works by evaluating responses, scoring them on rarity, conciseness, and difficulty alignment, and then updating model behavior over time.
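
To make the scoring idea concrete, here is a toy illustration (not ARM's actual implementation): the weights, the effort mapping, and the heuristics below are assumptions for demonstration only.

```python
from collections import Counter

# Toy scorer for candidate reasoning formats: "direct", "cot", or "code".
# Weights and heuristics are illustrative assumptions, not ARM's actual method.

def score_response(fmt: str, response: str, task_difficulty: float, format_counts: Counter) -> float:
    """Return a composite score: higher is better."""
    total = sum(format_counts.values()) or 1
    rarity = 1.0 - format_counts[fmt] / total          # reward under-used formats
    conciseness = 1.0 / (1.0 + len(response.split()))  # reward shorter answers
    # Assume longer formats suit harder tasks: map each format to an "effort" level.
    effort = {"direct": 0.2, "cot": 0.6, "code": 0.8}.get(fmt, 0.5)
    difficulty_alignment = 1.0 - abs(task_difficulty - effort)
    return 0.3 * rarity + 0.3 * conciseness + 0.4 * difficulty_alignment

counts = Counter({"direct": 10, "cot": 40, "code": 5})
candidates = {
    "direct": "42",
    "cot": "First compute 6*7, which is 42, so the answer is 42.",
    "code": "print(6*7)",
}
best = max(candidates, key=lambda f: score_response(f, candidates[f], task_difficulty=0.3, format_counts=counts))
print("chosen format:", best)
```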

Read more →

A Novel Approach to Autonomous Research: Implementing NOVELSEEK with Modular AI Agents

Summary

AI research tools today are often narrow: one generates summaries, another ranks models, a third suggests ideas. But real scientific discovery isn’t a single step—it’s a pipeline. It’s iterative, structured, and full of feedback loops.

In this post, I show how to build a modular AI system that mirrors this full research lifecycle. From initial idea generation to method planning, each phase is handled by a specialized agent working in concert.

Read more →

The Self-Aware Pipeline: Empowering AI to Choose Its Own Path to the Goal

🔧 Summary

Modern AI systems require more than raw processing power: they need contextual awareness, strategic foresight, and adaptive learning capabilities. In this post, we walk through how we implemented a self-aware pipeline system inspired by the Devil’s Advocate paper.

Unlike brittle, static workflows, this architecture empowers agents to reflect on their own steps, predict failure modes, and adapt their strategies in real time.


🧠 Grounding in Research

Devil’s Advocate (ReReST)

ReReST: Devil's Advocate: Anticipatory Reflection for LLM Agents introduces a self-training framework for LLM agents. The core idea is to have a “reflector” agent anticipate failures and revise the original plan before execution, a powerful method for reducing hallucinations and improving sample quality. Our implementation draws heavily on these ideas to enable dynamic planning and feedback loops within the pipeline.
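
To make the anticipate-then-revise idea tangible, here is a minimal sketch of a plan/reflect/revise loop. The prompts, the `call_llm` placeholder, and the stopping condition are assumptions for illustration; this is not the paper's exact procedure or our pipeline's code.

```python
# Minimal plan -> anticipate failures -> revise -> execute loop (illustrative only).

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your own LLM client here")

def anticipatory_reflection(task: str, max_revisions: int = 2) -> str:
    plan = call_llm(f"Write a step-by-step plan for this task:\n{task}")
    for _ in range(max_revisions):
        critique = call_llm(
            "Act as a devil's advocate. Before we execute this plan, list the most "
            f"likely ways it could fail:\nTask: {task}\nPlan:\n{plan}"
        )
        if "no significant risks" in critique.lower():
            break
        plan = call_llm(
            f"Revise the plan to address these anticipated failures:\n{critique}\n\nOriginal plan:\n{plan}"
        )
    return call_llm(f"Execute the following plan and report the result:\n{plan}")
```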

Read more →

General Reasoner: The Smarter Local Agent

🔧 Summary

The General Reasoner paper shows how we can train LLMs to reason across domains using diverse data and a generative verifier. In this post, I walk through our open-source implementation, showing how we built a modular reasoning agent capable of generating multiple hypotheses, evaluating them with an LLM-based judge, and selecting the best answer.


🧠 What We Built

We built a GeneralReasonerAgent that:

  • Dynamically generates multiple hypotheses using different reasoning strategies (e.g., cot, debate, verify_then_answer, etc.)
  • Evaluates each pair of hypotheses using either a local LLM judge or our custom MR.Q evaluator
  • Classifies the winning hypothesis using rubric dimensions
  • Logs structured results to a PostgreSQL-backed system
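
As a rough illustration of the pairwise-evaluation step (not the actual co_ai code), the sketch below runs a simple round-robin tournament where a judge picks the winner of each hypothesis pair; `judge` is a placeholder for an LLM judge or MR.Q-style evaluator.

```python
from itertools import combinations
from collections import Counter

def judge(hypothesis_a: str, hypothesis_b: str) -> str:
    """Placeholder: return whichever of the two hypotheses an LLM judge prefers."""
    raise NotImplementedError

def select_best(hypotheses: list[str]) -> str:
    # Round-robin tournament: every hypothesis is compared against every other.
    wins = Counter()
    for a, b in combinations(hypotheses, 2):
        wins[judge(a, b)] += 1
    return max(hypotheses, key=lambda h: wins[h])
```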

All of this was integrated with our existing co_ai framework, which includes:

Read more →

Building a Self-Improving Chain-of-Thought Agent: Local LLMs Meet the CoT Encyclopedia

Most AI systems generate answers. Ours examines how they think. This isn’t just prompt engineering: it is structured reasoning at scale.

🔧 Summary

Large Language Models are transforming every field, yet their internal reasoning remains a formidable black box. We can get brilliant outputs, but without understanding how those conclusions were reached, we’re left guessing how to improve, debug, or even trust them. This opacity limits our ability to build truly reliable and self-improving AI systems.

Read more →

Self-Improving Agents: Applying the Sharpening Framework to Local LLMs

This is the second post in a 100-part series, where we take breakthrough AI papers and turn them into working code, building the next generation of AI, one idea at a time.

🔧 Summary

In my previous post, I introduced co_ai, a modular implementation of the AI co-scientist concept, inspired by DeepMind’s recent paper Towards an AI Co-Scientist.

But now, we’re going deeper.

This isn’t just about running prompts through an agent system; it’s about building something radically different:

Read more →

Building an AI Co-Scientist

This is the first post in a 100-part series, where we take breakthrough AI papers and turn them into working code, building the next generation of AI, one idea at a time.

🧾 Summary

In this post, I’ll walk through how I implemented the ideas from
Towards an AI Co-Scientist in a working system called co_ai.

Read more →

Building Clipper: An AI Image Generator You Control

“If you’ve ever pasted 50 prompts into an image generator one-by-one, this is for you. I hit my limit and built Clipper to solve it.”

📖 Summary

In the previous blog post I presented a research paper, Cross-Modal Cognitive Mapping, about turning your conversations into images to gradually map your thought patterns. The implementation of that paper is an application called Prism.

A component of this app is image generation from prompts or from your conversations. All of the foundation models support this, but it’s a pretty janky process: you generate the prompt, paste it into a text box, and download the image. I just spent a week doing this while building a prompt toolkit, and the whole time I kept wishing I had built the app I’m going to share with you now.

Read more →

Is Freestyle Cognition Real? A Reasoning Models Verdict

“The best way to predict the future is to create it.” A. Lincoln

Summary

When you first hear about Freestyle Cognition, it might sound like just another buzzword:

“Talk to the AI a bit differently. Reflect. Iterate.”

But is there actually a real method underneath?
Or is it just a vibe, a way of feeling like you’re doing something smarter?

We put that question to the ultimate test:
We asked a dedicated Reasoning Model to rigorously evaluate Freestyle Cognition, using structured thinking loops (ROW, CRITIC, GROW).

Read more →

What Is Freestyle Cognition

“If you want something new, you have to stop doing something old.” Peter Drucker

Why AI Interaction Needs to Evolve

Today, most people interact with AI like it’s a vending machine:
Type request.
Wait for result.
Repeat.

It’s effective but deeply limited.
It’s like watching someone dig through concrete with their bare hands when a bulldozer sits nearby.

Freestyle Cognition changes that.


The Police Room and the Hidden Truth

You’ve seen it in every great detective movie:
The suspect sits in a bare interrogation room, silent.
But as the conversation unfolds, question after question, misdirection after misdirection, the suspect begins to talk.
Soon, they reveal truths even they didn’t realize they knew.

Read more →

Cross-Modal Cognitive Mapping: A Technical Overview

Cross-Modal Cognitive Mapping

A Technical Overview of System Design and Implementation

Author: Ernan Hughes
Published: April 2025


Abstract

Cross-Modal Cognitive Mapping is a new framework designed to extend traditional text-based cognition modeling into multimodal representations.
This system combines text prompts, visual generation, human selection behavior, and semantic memory retrieval to better understand and track human conceptual architectures.

This post presents a technical overview of the core architecture, database design, embedding workflows, search functionality, and resonance mapping built during the initial research phase.

Read more →

Uncovering Reasoning in LLMs with Sparse Autoencoders

Summary

Large Language Models (LLMs) like DeepSeek-R1 show remarkable reasoning abilities, but how these abilities are internally represented has remained a mystery. This paper explores the mechanistic interpretability of reasoning in LLMs using Sparse Autoencoders (SAEs) — a tool that decomposes LLM activations into human-interpretable features. In this post, we’ll:

  • Explain the SAE architecture used
  • Compute and visualize ReasonScore
  • Explore feature steering with sample completions
  • Provide live visualizations using Python + Streamlit

Read more →

Optimizing Prompt Generation with MARS and DSPy

🕒 TL;DR

  • We explore MARS, a multi-agent prompt optimizer using Socratic dialogue.
  • We implement it using DSPy + Fin-R1 + EDGAR, giving us an end-to-end financial reasoning pipeline.
  • We deploy the whole thing to Hugging Face Spaces with a Gradio UI.

🌟 Introduction

Prompt engineering has become the defining skill of the Large Language Model (LLM) era, a delicate balance between science and art. Crafting the perfect prompt often feels like an exercise in intuition, trial, and error. But what if we could take the guesswork out of the process? What if prompts could optimize themselves?

Read more →

Fin-R1: a Financial Reasoning LLM with Reinforcement Learning and CoT

Introduction

Fin-R1 is a new model specifically fine-tuned for financial reasoning, with performance that beats much larger models like DeepSeek-R1.

This post puts the model to work and compares it with Phi-3 across a variety of tasks.

Phi-3 is a lightweight, general-purpose model known for its efficiency and strong reasoning performance at smaller parameter scales. It serves as a great baseline for assessing how domain-specific tuning in Fin-R1 improves financial understanding and response structure.

Read more →

MR.Q: A New Approach to Reinforcement Learning in Finance

✨ Introduction: Real-Time Self-Tuning with MR.Q

Most machine learning models need hundreds of examples, large GPUs, and hours of training to learn anything useful. But what if you could build a system that gets smarter with just a handful of preference examples, runs entirely on your CPU, and improves while you work?

That’s exactly what MR.Q offers.

🔍 It doesn’t require full retraining.
⚙️ It doesn’t modify the base model.
🧠 It simply learns how to judge quality — and does it fast.

Read more →

Using Hugging Face Datasets

Summary

Machine learning operates on data: it processes data to extract meaningful information, which can then be used to make intelligent decisions. This is the foundation of Artificial Intelligence. There is a caveat, though: the data has to be high quality. The more data you have, and the higher its quality, the better your apps will be.
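
As a quick taste of the library, the snippet below loads a public dataset and inspects a record; the dataset name is just an example.

```python
from datasets import load_dataset

# Load the IMDB reviews dataset from the Hugging Face Hub (used here only as an example).
dataset = load_dataset("imdb")

print(dataset)                 # splits, feature names, and row counts
print(dataset["train"][0])     # first training example: {"text": ..., "label": ...}

# Datasets behave like columnar tables: filtering and mapping are cached on disk.
short_reviews = dataset["train"].filter(lambda row: len(row["text"]) < 500)
print(len(short_reviews))
```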

Read more →

Detecting AI-Generated Text: Challenges and Solutions

Summary

Artificial Intelligence (AI) has revolutionized the way we generate and consume text. From chatbots crafting customer responses to AI-authored articles, it is reshaping how content gets created. As AI-generated text becomes harder to tell apart from human writing, verifying the source of a piece of text has never been more critical. Here are some of the reasons it matters:

  • Preventing plagiarism
  • Maintaining academic integrity
  • Ensuring transparency in content creation
  • Avoiding model collapse: if AI models are repeatedly trained on AI-generated text, their quality may degrade over time.

In this blog post, we’ll explore the current most effective methods for detecting AI-generated text.

Read more →

Shakespeare and the Bible: An AI Investigation

Summary

Could the greatest playwright of all time have secretly shaped one of the most influential religious texts in history? Some believe William Shakespeare left his mark on the King James Bible hidden in plain sight. With the power of AI, we’ll investigate whether there’s any truth to this conspiracy.

You can read about the conspiracy here:

PostgreSQL for AI: Storing and Searching Embeddings with pgvector

Summary

Vector databases are essential for modern AI applications like semantic search, recommendation systems, and natural language processing. They allow us to store and query high-dimensional vectors efficiently. With the pgvector extension, PostgreSQL becomes a powerful vector database, enabling you to combine traditional relational data with vector-based operations.

In this post, we will walk through the full process:

  • Installing PostgreSQL and pgvector
  • Setting up a vector-enabled database
  • Generating embeddings using Ollama
  • Running similarity queries with Python

By the end, you’ll be able to store, query, and compare high-dimensional vectors in PostgreSQL, opening up new possibilities for AI-powered applications.
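
A compressed sketch of that workflow, assuming pgvector is installed, a local PostgreSQL instance is running, and your embedding comes from a model of your choice (table name, dimension, and credentials are placeholders):

```python
import numpy as np
import psycopg2
from pgvector.psycopg2 import register_vector

# Assumes PostgreSQL runs locally with the pgvector extension available;
# connection details are placeholders for your own setup.
conn = psycopg2.connect(dbname="ai", user="postgres", password="postgres", host="localhost")
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("CREATE TABLE IF NOT EXISTS documents (id bigserial PRIMARY KEY, body text, embedding vector(768));")
conn.commit()
register_vector(conn)  # teaches psycopg2 to convert numpy arrays to/from the vector type

embedding = np.random.rand(768)  # stand-in for an embedding produced by Ollama or another model
cur.execute("INSERT INTO documents (body, embedding) VALUES (%s, %s)", ("hello world", embedding))

# Nearest-neighbour search by L2 distance (the <-> operator); <=> gives cosine distance.
cur.execute("SELECT body FROM documents ORDER BY embedding <-> %s LIMIT 5", (embedding,))
print(cur.fetchall())

conn.commit()
cur.close()
conn.close()
```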

Read more →

Build Smarter AI: Leveraging the Model Context Protocol for Dynamic Context

Summary

The evolution of technology is driven by protocols: structured ways for systems to communicate and interact. The internet, APIs, and even modern databases rely on protocols to function efficiently. Similarly, as AI becomes more powerful, it needs a structured and standardized way to manage context across interactions.

Enter the Model Context Protocol (MCP), a framework designed to enhance the way AI models understand, retain, and utilize context over multiple exchanges. Large Language Models (LLMs) are powerful, but without effective context management, they can:

Read more →

Getting Started with Neo4j: Build Your First Knowledge Graph

Summary

In AI and data science, knowledge graphs are powerful tools for modeling complex relationships between entities. They enable intelligent querying, recommendation systems, and semantic search. Neo4j, an open-source graph database, is one of the most popular tools for building and managing knowledge graphs.

In this post, we’ll walk you through setting up Neo4j, configuring it for use as a knowledge graph, and manipulating the database using Python.

Knowledge Graph Example

Read more →

Beyond Text Generation: Coding Ollama Function Calls and Tools

Summary

Function calling allows Large Language Models (LLMs) to interact with APIs, databases, and other tools, making them more than just text generators.

Integrating LLMs with functions enables you to harness their powerful text processing capabilities, seamlessly enhancing the technological solutions you develop.

This post will explain how you can call local Python functions and tools in Ollama.


Introduction to Ollama Function Calling

Ollama allows you to run state-of-the-art LLMs like Qwen, Llama, and others locally without relying on cloud APIs. Its function-calling feature enables models to execute external Python functions, making it ideal for applications like chatbots, automation tools, and data-driven systems.
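
Here is a minimal sketch of the flow using the ollama Python client. It assumes a locally running Ollama server and a tool-capable model; the tool schema and model name are examples, and the exact response shape can vary slightly between client versions.

```python
import ollama

# A plain Python function we want the model to be able to call.
def get_weather(city: str) -> str:
    return f"It is sunny in {city}."

# JSON-schema style tool description passed to the model.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = ollama.chat(
    model="llama3.1",  # assumption: a locally pulled model that supports tools
    messages=[{"role": "user", "content": "What's the weather like in Dublin?"}],
    tools=tools,
)

# If the model decided to call a tool, dispatch it ourselves and use the result.
message = response["message"]
for call in message.get("tool_calls") or []:
    if call["function"]["name"] == "get_weather":
        print(get_weather(**call["function"]["arguments"]))
```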

Read more →

From +AI to AI+: Embracing AI as the Foundation of Everything You Do

Summary

What if AI isn’t just a tool but the foundation of how you work, create, and think?

Today there are two distinct ways to approach AI: +AI and AI+. +AI means integrating AI into an existing workflow, business, or process. It’s the approach discussed in countless books, YouTube videos, and business strategies. If you’re working, running a business, or earning money, this is the natural and logical way to use AI. But this post isn’t about that. Instead, this post is about AI+, a radically different approach where AI is the foundation of everything you do.

Read more →

Building AI-Powered Applications with Haystack and Ollama

Summary

In this post, I will demonstrate how to set up and use Haystack with Ollama.

Haystack is a framework that helps you build applications powered by LLMs.

  • It offers extensive LLM-related functionality.
  • It is open source under the Apache license.
  • It is actively developed, with numerous contributors.
  • It is widely used in production by various clients.

These are some of the key things to check for when choosing a library for a project.

Read more →

LiteLLM: A Lightweight Wrapper for Multi-Provider LLMs

Summary

In this post I will cover LiteLLM. I used it in my implementation of TextGrad, and it also featured in the blog posts I did about agents.

Working with multiple LLM providers is painful. Every provider has its own API, requiring custom integration, different pricing models, and maintenance overhead. LiteLLM solves this by offering a single, unified API that allows developers to switch between OpenAI, Hugging Face, Cohere, Anthropic, and others without modifying their code.

Read more →

TextGrad: dynamic optimization of your LLM

Summary

This post aims to be a comprehensive tutorial on TextGrad.

TextGrad enables the optimization of LLM pipelines using textual feedback on their responses.

It will be part of SmartAnswer, the ultimate LLM query tool, which I will be blogging about shortly.


Why TextGrad?

  • Brings Gradient Descent to LLMs – Instead of numerical gradients, TextGrad leverages textual feedback to iteratively improve outputs.
  • Automates Prompt Optimization – Eliminates the guesswork in refining LLM prompts.
  • Works with Any LLM – From OpenAI’s GPT models to local models served through Ollama.

What is TextGrad?

Bringing Gradients to LLM Optimization

Traditional AI optimization techniques rely on numerical gradients computed via backpropagation. However, in LLM-driven AI systems, inputs and outputs are often text, making standard gradient computation impossible.
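
As a flavour of the API, a single "textual gradient" step looks roughly like this. The calls follow the TextGrad project's documented quickstart, but treat the engine name and exact signatures as assumptions about your installed version.

```python
import textgrad as tg

# Engine choice is an assumption: any backend TextGrad supports (OpenAI, local model, etc.).
tg.set_backward_engine("gpt-4o", override=True)
model = tg.BlackboxLLM("gpt-4o")

question = tg.Variable(
    "If a train travels 60 km in 45 minutes, what is its average speed in km/h?",
    requires_grad=False,
    role_description="question to the LLM",
)
answer = model(question)
answer.set_role_description("concise and accurate answer to the question")

# The "loss" is itself natural-language feedback produced by an evaluator prompt.
loss_fn = tg.TextLoss("Evaluate the answer for correctness and clarity. Be critical and concise.")
loss = loss_fn(answer)

loss.backward()                      # propagate the textual feedback
optimizer = tg.TGD(parameters=[answer])
optimizer.step()                     # rewrite the answer using that feedback
print(answer.value)
```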

Read more →

The Power of Logits: Unlocking Smarter, Safer LLM Responses

Summary

In this blog post:

  1. I want to fully explore logits and how they can be used to enhance AI applications
  2. I want to understand the ideas from this paper: “Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering”

The paper introduces Selective Question Answering (SQA), an approach that uses confidence scores to decide when an answer should be given at all. In this post, we’ll cover the core insights of the paper and implement a basic confidence-based selection function in Python.
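
A basic version of that selection function is easy to write: turn the logits into probabilities, take the probability of the top answer as the confidence, and abstain below a threshold. The threshold and example scores below are arbitrary; the mechanism is the point.

```python
import torch
import torch.nn.functional as F

def answer_or_abstain(logits: torch.Tensor, labels: list[str], threshold: float = 0.75) -> str:
    """logits: unnormalized scores for each candidate answer (shape [num_candidates])."""
    probs = F.softmax(logits, dim=-1)      # convert logits to a probability distribution
    confidence, idx = probs.max(dim=-1)    # probability assigned to the top answer
    if confidence.item() < threshold:
        return "I don't know"              # abstain: confidence is below the bar
    return labels[idx.item()]

# Example: scores for four multiple-choice options.
logits = torch.tensor([2.1, 0.3, -0.5, 0.9])
print(answer_or_abstain(logits, ["A", "B", "C", "D"]))
```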

Read more →

Efficient Similarity Search with FAISS and SQLite in Python

Summary

This is another component of SmartAnswer, an enhanced LLM interface.

In this blog post, we introduce a wrapper class, FaissDB, which integrates FAISS with SQLite or any database to manage document embeddings and enable efficient similarity search. This approach combines FAISS’s vector search capabilities with the storage and querying power of a database, making it ideal for applications such as Retrieval-Augmented Generation (RAG) and recommendation systems.

It builds on the PaperSearch tool.
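
The core trick behind a wrapper like this is keeping FAISS vector IDs in sync with database rows. Below is a minimal sketch of that idea (not the actual FaissDB class): SQLite stores the text, FAISS stores the vectors, and the row id links them. The dimension and random vectors are stand-ins for real embeddings.

```python
import sqlite3
import numpy as np
import faiss

DIM = 384  # embedding dimension; depends on your embedding model

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
index = faiss.IndexIDMap(faiss.IndexFlatL2(DIM))  # lets us attach our own row ids to vectors

def add(doc_id: int, body: str, embedding: np.ndarray) -> None:
    conn.execute("INSERT INTO docs (id, body) VALUES (?, ?)", (doc_id, body))
    index.add_with_ids(embedding.reshape(1, -1).astype("float32"),
                       np.array([doc_id], dtype="int64"))

def search(query_embedding: np.ndarray, k: int = 3) -> list[str]:
    _, ids = index.search(query_embedding.reshape(1, -1).astype("float32"), k)
    placeholders = ",".join("?" * len(ids[0]))
    rows = conn.execute(f"SELECT body FROM docs WHERE id IN ({placeholders})",
                        [int(i) for i in ids[0]]).fetchall()
    return [r[0] for r in rows]  # note: SQL does not preserve the ranking order here

rng = np.random.default_rng(0)
for i, text in enumerate(["apples", "oranges", "bicycles"], start=1):
    add(i, text, rng.random(DIM))
print(search(rng.random(DIM)))
```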

Read more →

Automating Paper Retrieval and Processing with PaperSearch

Summary

This is part one in a series of blog posts working towards SmartAnswer, a comprehensive improvement to how Large Language Models (LLMs) answer questions.

This tool will be the source of data for SmartAnswer, allowing it to find and use better data when generating answers.

I want this tool to be included in that solution, but I don’t want all of its code distracting from the SmartAnswer write-up. Hence this post.

Read more →

SQLite: the small database that packs a big punch

Summary

SQLite is one of the most widely used database engines in the world, powering everything from mobile applications (Android, iOS) to browsers (Google Chrome, Mozilla Firefox), IoT devices, and even gaming consoles. Unlike traditional client-server databases (e.g., MySQL, PostgreSQL), SQLite is an embedded, serverless database that stores data in a single file, making it easy to manage and deploy.

Python developers frequently choose SQLite for its inherent simplicity and portability, leveraging the built-in sqlite3 module for effortless database integration.
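Because the sqlite3 module ships with Python, a complete working example needs nothing but the standard library:

```python
import sqlite3

# The whole database lives in a single file (or in memory, as here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO papers (title, year) VALUES (?, ?)",
    [("Example Paper A", 2023), ("Example Paper B", 2024)],
)
conn.commit()

for row in conn.execute("SELECT title, year FROM papers WHERE year >= ? ORDER BY year", (2023,)):
    print(row)

conn.close()
```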

Read more →

RAFT: Reward rAnked FineTuning - A New Approach to Generative Model Alignment

Summary

This post is an explanation of the paper RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment.

Generative foundation models, such as Large Language Models (LLMs) and diffusion models, have revolutionized AI by achieving human-like content generation. However, they often suffer from:

  1. Biases – Models can learn and reinforce societal biases present in the training data (e.g., gender, racial, or cultural stereotypes).
  2. Ethical Concerns – AI-generated content can be misused for misinformation, deepfakes, or spreading harmful narratives.
  3. Alignment Issues – The model’s behavior may not match human intent, leading to unintended or harmful outputs despite good intentions.

Traditionally, Reinforcement Learning from Human Feedback (RLHF) has been used to align these models, but RLHF comes with stability and efficiency challenges. To address these limitations, RAFT (Reward rAnked FineTuning) was introduced as a more stable and scalable alternative. RAFT fine-tunes models using a ranking-based approach to filter high-reward samples, allowing generative models to improve without complex reinforcement learning setups.
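
Conceptually the RAFT loop is simple: sample several candidate outputs per prompt, score them with a reward model, keep only the best, and fine-tune on that filtered set. Here is a schematic sketch of one iteration; `generate`, `reward`, and `fine_tune` are placeholders for your own model, reward model, and training routine, not a specific library's API.

```python
# One schematic RAFT iteration (placeholders, not a specific library's API).

def generate(model, prompt: str, n: int) -> list[str]:
    raise NotImplementedError("sample n candidate responses from the current model")

def reward(prompt: str, response: str) -> float:
    raise NotImplementedError("score a response with a reward model")

def fine_tune(model, dataset: list[tuple[str, str]]):
    raise NotImplementedError("supervised fine-tuning on (prompt, response) pairs")

def raft_iteration(model, prompts: list[str], samples_per_prompt: int = 8):
    filtered = []
    for prompt in prompts:
        candidates = generate(model, prompt, samples_per_prompt)
        best = max(candidates, key=lambda r: reward(prompt, r))  # ranking + filtering step
        filtered.append((prompt, best))
    return fine_tune(model, filtered)  # ordinary supervised fine-tuning, no RL machinery
```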

Read more →

Faiss: A Fast, Efficient Similarity Search Library

Summary

Searching through massive datasets efficiently is a challenge, whether in image retrieval, recommendation systems, or semantic search. Faiss (Facebook AI Similarity Search) is a powerful open-source library developed by Meta to handle high-dimensional similarity search at scale.

It’s well-suited for tasks like:

  • Image search: Finding visually similar images in a large database.
  • Recommendation systems: Recommending items (products, movies, etc.) to users based on their preferences.
  • Semantic search: Finding documents or text passages that are semantically similar to a given query.
  • Clustering: Grouping similar vectors together.

I will be using it in many of the upcoming projects on this blog; it is a good solution for local development.
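
Basic usage is only a few lines; the dimension and random vectors below just stand in for real embeddings.

```python
import numpy as np
import faiss

dim = 128
rng = np.random.default_rng(42)
vectors = rng.random((10_000, dim), dtype="float32")   # pretend these are embeddings

index = faiss.IndexFlatL2(dim)   # exact L2 search; swap in IVF/HNSW indexes for larger scale
index.add(vectors)

query = rng.random((1, dim), dtype="float32")
distances, ids = index.search(query, 5)
print(ids[0], distances[0])      # ids and distances of the 5 nearest vectors
```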

Read more →

K-Means Clustering

Summary

Imagine you have a dataset of customer profiles. How can you group similar customers together to tailor marketing campaigns? This is where K-Means clustering comes into play.

K-Means is a popular unsupervised learning algorithm used for clustering data points into distinct groups based on their similarities. It is widely used in various domains such as customer segmentation, image compression, and anomaly detection.

In this blog post, we’ll cover how K-Means works and demonstrate its implementation in Python using scikit-learn.
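
Here is the shortest useful version with scikit-learn, using synthetic data standing in for customer features:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2-D data standing in for customer features (e.g., spend vs. visits).
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=1.0, random_state=42)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

print("cluster sizes:", np.bincount(labels))
print("centroids:\n", kmeans.cluster_centers_)
```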

Read more →

AI: The Future Interface to Technology

Summary

Imagine a world where you simply think of a task and invisible devices seamlessly execute it. In fact, most of what used to be your daily tasks won’t even cross your mind; they will be executed automatically. Sounds like science fiction? This, I believe, is the future of human-technology interaction: the technology disappears behind an AI-driven interface.

Do we currently have Artificial Intelligence?

Artificial intelligence refers to computer programs designed to mimic human cognitive abilities, such as understanding natural language, recognizing patterns, learning from data, and solving complex problems. While AGI aims to replicate general human intelligence, narrow AI focuses on excelling at specific tasks within predefined parameters.

A common debate in AI discourse revolves around whether large language models (LLMs) truly qualify as artificial intelligence or are merely sophisticated algorithms mimicking human-like behavior. Discussions about Artificial General Intelligence (AGI), a theoretical form of AI capable of replicating human cognition across all domains, are intriguing, but they distract from the practical applications of AI that already exist today. AGI may never materialize, not because it’s unachievable, but because it lacks practical utility: a godlike AI with unrestricted capabilities offers little tangible benefit compared to specialized narrow AI systems. What we have now is narrow AI, which excels at specific tasks and operates within defined parameters. This AI can become broader through the use of agents, and it can self-improve and learn automatically, as I have shown in previous blog posts.

Read more →

Self-Learning LLMs for Stock Forecasting: A Python Implementation with Direct Preference Optimization

Summary

Forecasting future events is a critical task in fields like finance, politics, and technology. However, improving the forecasting abilities of large language models (LLMs) often requires extensive human supervision. In this post, we explore a novel approach from the paper LLMs Can Teach Themselves to Better Predict the Future that enables LLMs to teach themselves better forecasting skills using self-play and Direct Preference Optimization (DPO). We’ll walk through a Python implementation of this method, step by step.
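
The heart of the method is turning the model's own forecasts into preference data. Below is a schematic sketch of that step: sample several forecasts, rank them against the realized outcome, and emit chosen/rejected pairs in the prompt/chosen/rejected format that DPO trainers such as TRL's DPOTrainer expect. The sampling and scoring functions are placeholders for your own model and metric.

```python
from itertools import combinations

def sample_forecasts(prompt: str, n: int = 4) -> list[str]:
    raise NotImplementedError("sample n forecasts (with rationales) from the LLM")

def score_forecast(forecast: str, outcome: bool) -> float:
    raise NotImplementedError("score a forecast against the realized outcome, e.g. a Brier-style score")

def build_preference_pairs(prompt: str, outcome: bool) -> list[dict]:
    """Self-play step: rank the model's own forecasts and keep (better, worse) pairs for DPO."""
    forecasts = sample_forecasts(prompt)
    ranked = sorted(forecasts, key=lambda f: score_forecast(f, outcome), reverse=True)
    return [
        {"prompt": prompt, "chosen": better, "rejected": worse}
        for better, worse in combinations(ranked, 2)
    ]
```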

Read more →

Using Quantization to speed up and slim down your LLM

Summary

Large Language Models (LLMs) are powerful, but their size can lead to slow inference speeds and high memory consumption, hindering real-world deployment. Quantization, a technique that reduces the precision of model weights, offers a powerful solution. This post will explore how to use quantization techniques like bitsandbytes, AutoGPTQ, and AutoRound to dramatically improve LLM inference performance.

What is Quantization?

Quantization reduces the computational and storage demands of a model by representing its weights with lower-precision data types. Imagine the data is water and we hold that water in buckets: most of the time we don’t need massive floating-point buckets to hold values that could be represented by integers. Quantization uses smaller buckets to hold the same amount of water, so you save space and can move the containers more quickly. It trades a tiny amount of precision for significant gains in speed and memory efficiency.
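
For example, loading a model in 4-bit with bitsandbytes via Transformers looks roughly like this; the model name is a placeholder for any causal LM you have access to, and the exact settings depend on your hardware.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder: any causal LM you have access to

# NF4 4-bit quantization with bf16 compute: small buckets for storage, larger ones for the math.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")

inputs = tokenizer("Quantization lets you", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```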

Read more →

Mastering LLM Fine-Tuning: A Practical Guide with LLaMA-Factory and LoRA

Summary

Large Language Models (LLMs) offer immense potential, but realizing that potential often requires fine-tuning them on task-specific data. This guide provides a comprehensive overview of LLM fine-tuning, focusing on practical implementation with LLaMA-Factory and the powerful LoRA technique.

What is Fine-Tuning?

Fine-tuning adapts a pre-trained model to a new, specific task or dataset. It leverages the general knowledge already learned by the model from a massive dataset (source domain) and refines it with a smaller, more specialized dataset (target domain). This approach saves time, resources, and data while often achieving superior performance.
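
The core of LoRA, whether you drive it through LLaMA-Factory or directly, is a small adapter configuration. Here is a minimal PEFT sketch; the target module names vary by model architecture, so treat them as an assumption (these are GPT-2's attention projections, chosen only because the model is small).

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType

model = AutoModelForCausalLM.from_pretrained("gpt2")  # small model used purely for illustration

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection; differs for Llama-style models
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable
```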

Read more →

Debugging Jupyter Notebooks in VS Code

Summary

Visual Studio Code is the most popular editor for development.

Jupyter Notebooks are the most widely used way to share, demonstrate, and develop code in modern AI development.

Debugging is not just for when you have a bug. After you have written any substantial piece of code, I suggest stepping through it in the debugger if possible. This can improve your understanding and the quality of the code you have written.

Read more →

DeepResearch Part 3: Getting the best web data for your research

Summary

This post details building a robust web data pipeline using SmolAgents. We’ll create tools to retrieve content from various web endpoints, convert it to a consistent format (Markdown), store it efficiently, and then evaluate its relevance and quality using Large Language Models (LLMs). This pipeline is crucial for building a knowledge base for LLM applications.

Web Data Converter (MarkdownConverter)

We leverage the MarkdownConverter class, inspired by the one in autogen, to handle the diverse formats encountered on the web. This ensures consistency for downstream processing.
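
If you just want to see the fetch-and-normalize step in isolation, a small stand-in using requests plus the markdownify package captures the idea (this is not the autogen MarkdownConverter; the URL is an example and error handling is minimal):

```python
import requests
from markdownify import markdownify as to_markdown

def fetch_as_markdown(url: str) -> str:
    """Download a web page and convert its HTML to Markdown for consistent downstream processing."""
    response = requests.get(url, timeout=30, headers={"User-Agent": "deepresearch-bot"})
    response.raise_for_status()
    return to_markdown(response.text, heading_style="ATX")

if __name__ == "__main__":
    print(fetch_as_markdown("https://example.com")[:500])
```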

Read more →

DeepResearch Part 2: Building a RAG Tool for arXiv PDFs

Summary

In this post, we’ll build a Retrieval Augmented Generation (RAG) tool to process the PDF files downloaded from arXiv in the previous post DeepResearch Part 1. This RAG tool will be capable of loading, processing, and semantically searching the document content. It’s a versatile tool applicable to various text sources, including web pages.

Building the RAG Tool

Following up on our arXiv downloader, we now need a tool to process the downloaded PDFs. This post details the creation of such a tool.

Read more →

DeepResearch Part 1: Building an arXiv Search Tool with SmolAgents

Summary

This post kicks off a series of three where we’ll build, extend, and use the open-source DeepResearch application inspired by the Hugging Face blog post. In this first part, we’ll focus on creating an arXiv search tool that can be used with SmolAgents.

DeepResearch aims to empower research by providing tools that automate and streamline the process of discovering and managing academic papers. This series will demonstrate how to build such tools, starting with a powerful arXiv search tool.
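
A minimal version of such a tool, using the arxiv package and SmolAgents' @tool decorator, might look like the sketch below; treat the exact fields and the agent wiring as assumptions about your installed versions.

```python
import arxiv
from smolagents import tool

@tool
def search_arxiv(query: str, max_results: int = 5) -> str:
    """Search arXiv and return matching paper titles with links.

    Args:
        query: Free-text search query, e.g. "poisson disk sampling".
        max_results: Maximum number of papers to return.
    """
    search = arxiv.Search(query=query, max_results=max_results,
                          sort_by=arxiv.SortCriterion.Relevance)
    results = arxiv.Client().results(search)
    return "\n".join(f"{r.title} ({r.entry_id})" for r in results)

# The tool can then be handed to an agent, e.g. CodeAgent(tools=[search_arxiv], model=...).
print(search_arxiv("retrieval augmented generation"))
```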

Read more →

FFmpeg: A Practical Guide to Essential Command-Line Options

Introduction

FFmpeg is an incredibly versatile command-line tool for manipulating audio and video files. This post provides a practical collection of useful FFmpeg commands for common tasks.

FFmpeg Command Structure

The general structure of an FFmpeg command is:

ffmpeg [global_options] {[input_file_options] -i input_url} ... {[output_file_options] output_url} ...

Merging Video and Audio

Merging video and audio, with audio re-encoding

ffmpeg -i video.mp4 -i audio.wav -c:v copy -c:a aac output.mp4

Copying the audio without re-encoding

ffmpeg -i video.mp4 -i audio.wav -c copy output.mkv

Why copy audio?

Read more →

Writing Neural Networks with PyTorch

Summary

This post provides a practical guide to building common neural network architectures using PyTorch. We’ll explore feedforward networks, convolutional neural networks (CNNs), recurrent neural networks (RNNs), LSTMs, transformers, autoencoders, and GANs, along with code examples and explanations.


1️⃣ Understanding PyTorch’s Neural Network Module

PyTorch provides the torch.nn module to build neural networks. It provides classes for defining layers, activation functions, and loss functions, making it easy to create and manage complex network architectures in a structured way.
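
As a warm-up, here is the smallest useful example of the pattern every later architecture follows: subclass nn.Module, define layers in __init__, wire them together in forward, and train with a loss and an optimizer. The random data is just there to show the loop structure.

```python
import torch
from torch import nn

class TinyClassifier(nn.Module):
    def __init__(self, in_features: int = 4, hidden: int = 16, classes: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = TinyClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# One training step on random data, just to show the loop structure.
x, y = torch.randn(32, 4), torch.randint(0, 3, (32,))
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.3f}")
```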

Read more →

Mastering Prompt Engineering: A Practical Guide

Summary

This post provides a comprehensive guide to prompt engineering, the art of crafting effective inputs for Large Language Models (LLMs). Mastering prompt engineering is crucial for maximizing the potential of LLMs and achieving desired results.

Effective prompting is the easiest way to enhance your experience with Large Language Models (LLMs).

The prompts we make are our interface to LLMs. This is how we communicate with them. This is why it is important to understand how to do it well.

Read more →

Harnessing the Power of Stable Diffusion WebUI

Summary

In this blog I aim to try building using open source tools where possible. The benefits are price, control, knowledge and eventually quality. In the shorter term though the quality will trail the paid versions. My belief is we can construct AI applications to be self correcting sort of like how your camera auto focuses for you. This process will involve a lot of computation so using a paid service could be costly. This for me is the key reason to choose solutions using free tools.

Read more →

Creating AI-Powered Paper Videos: From Research to YouTube

Summary

This post demonstrates how to automatically transform a scientific paper (or any text/audio content) into a YouTube video using AI. We’ll leverage several powerful tools, including large language models (LLMs), Whisper for transcription, Stable Diffusion for image generation, and FFmpeg for video assembly. This process can streamline content creation and make research more accessible.

Overview

Our pipeline involves these steps:

  1. Audio Generation (Optional): If starting from a text document, we’ll use a text-to-speech service (like NotebookLM, or others) to create an audio narration.
  2. Transcription: We’ll use Whisper to transcribe the audio into text, including timestamps for each segment.
  3. Database Storage: The transcribed text, timestamps, and metadata will be stored in an SQLite database for easy management.
  4. Text Chunking: We’ll divide the transcript into logical chunks (e.g., by sentence or time duration).
  5. Concept Summarization: An LLM will summarize the core concept of each chunk.
  6. Image Prompt Generation: Another LLM will create a detailed image prompt based on the summary.
  7. Image Generation: Stable Diffusion (or a similar tool) will generate images from the prompts.
  8. Video Assembly: FFmpeg will combine the images and audio into a final video.
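
The final assembly step (step 8) can be as simple as shelling out to FFmpeg from Python. The frame rate, file names, and codec options below are assumptions you would tune for your own pipeline.

```python
import subprocess

def assemble_video(image_pattern: str = "frames/img%03d.png",
                   audio_path: str = "audio.wav",
                   output_path: str = "output.mp4",
                   seconds_per_image: int = 5) -> None:
    """Combine numbered images and a narration track into a video using FFmpeg."""
    cmd = [
        "ffmpeg", "-y",
        "-framerate", f"1/{seconds_per_image}",  # each image is shown for seconds_per_image seconds
        "-i", image_pattern,
        "-i", audio_path,
        "-c:v", "libx264", "-pix_fmt", "yuv420p",  # widely compatible video encoding
        "-shortest",                               # stop when the shorter stream ends
        output_path,
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    assemble_video()
```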

Prerequisites

  • Hugging Face CLI: Install it to download the Whisper model: pip install huggingface_hub
  • Whisper: Install the whisper-timestamped package, or your preferred Whisper implementation.
  • Ollama: You’ll need a running instance of Ollama to access the LLMs.
  • Stable Diffusion WebUI (or similar): For image generation.
  • FFmpeg: For video and audio processing. Ensure it’s in your system’s PATH.
  • Python Libraries: Install the necessary Python packages: pip install pydub requests Pillow (and any others as needed). The sqlite3 module ships with Python, so it does not need to be installed.

1️⃣ Audio Generation (Optional)

If you’re starting with a text document, you’ll need to convert it to audio. Several cloud services and libraries can do this. For this example, we’ll assume you have an audio file (audio.wav).

Read more →

Fast Poisson Disk Sampling in Arbitrary Dimensions

Summary

In this post I explore Robert Bridson’s paper Fast Poisson Disk Sampling in Arbitrary Dimensions and provide an example Python implementation. Additionally, I introduce an alternative method using Cellular Automata to generate Poisson disk distributions.

Poisson disk sampling is a widely used technique in computer graphics, particularly for applications like rendering, texture generation, and particle simulation. Its appeal lies in producing sample distributions with “blue noise” characteristics—random yet evenly spaced, avoiding clustering.
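
For reference, here is a compact 2-D sketch of Bridson's algorithm in plain Python (the full post generalizes it to arbitrary dimensions); the domain size and radius are arbitrary.

```python
import math
import random

def poisson_disk_2d(width: float, height: float, r: float, k: int = 30, seed: int = 0):
    """Bridson's algorithm in 2D: returns points at least r apart, in roughly linear time."""
    random.seed(seed)
    cell = r / math.sqrt(2)                       # cell size chosen so each cell holds at most one point
    cols, rows = int(width / cell) + 1, int(height / cell) + 1
    grid = [[None] * cols for _ in range(rows)]

    def grid_pos(p):
        return int(p[1] / cell), int(p[0] / cell)  # (row, col)

    def fits(p):
        gy, gx = grid_pos(p)
        for y in range(max(gy - 2, 0), min(gy + 3, rows)):
            for x in range(max(gx - 2, 0), min(gx + 3, cols)):
                q = grid[y][x]
                if q is not None and (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 < r * r:
                    return False
        return True

    first = (random.uniform(0, width), random.uniform(0, height))
    points, active = [first], [first]
    gy, gx = grid_pos(first)
    grid[gy][gx] = first

    while active:
        base = random.choice(active)
        for _ in range(k):                        # try up to k candidates in the annulus [r, 2r]
            angle = random.uniform(0, 2 * math.pi)
            dist = random.uniform(r, 2 * r)
            p = (base[0] + dist * math.cos(angle), base[1] + dist * math.sin(angle))
            if 0 <= p[0] < width and 0 <= p[1] < height and fits(p):
                points.append(p)
                active.append(p)
                gy, gx = grid_pos(p)
                grid[gy][gx] = p
                break
        else:                                     # no candidate fit: retire this point
            active.remove(base)
    return points

print(len(poisson_disk_2d(100, 100, r=5)))
```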

Read more →

Activation Functions

Introduction

Activation functions are a core component of neural networks: they introduce non-linearity into the model, enabling it to learn complex patterns. Without activation functions, a neural network would essentially act as a linear model, regardless of its depth.

Key Properties of Activation Functions

  • Non-linearity: Enables the model to learn complex relationships.
  • Differentiability: Allows backpropagation to optimize weights.
  • Range: Defines the output range, impacting gradient flow.

In this post I will outline the most common activation functions, how they are calculated, and when they should be used.
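
As a small NumPy reference before the detailed walk-through, here are a few of the functions covered, plus two of their derivatives (the derivatives are what backpropagation uses):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # derivative used during backpropagation

def relu_grad(x):
    return (x > 0).astype(float)  # 1 where the unit is active, 0 elsewhere

x = np.linspace(-3, 3, 7)
for name, fn in [("sigmoid", sigmoid), ("tanh", np.tanh), ("relu", relu), ("leaky_relu", leaky_relu)]:
    print(f"{name:>10}: {np.round(fn(x), 3)}")
```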

Read more →

SVM: An Introduction to Support Vector Machines

Summary

In this post I will implement a Support Vector Machine (SVM) in Python, then describe what it does, how it works, and some of its applications.

What Are Support Vector Machines (SVM)?

Support Vector Machines (SVM) are supervised learning algorithms used for classification and regression tasks. Their strength lies in handling both linear and non-linear problems effectively. By finding the optimal hyperplane that separates classes, SVMs maximize the margin between data points of different classes, making them highly effective in high-dimensional spaces.

Read more →

Color Wars: Cellular Automata fight until one dominates

Summary

This post is about color wars: a grid containing dynamic automata at war until one dominates.

Implementation

The implementation consists of two core components: the Grid and the CellularAutomaton.

1️⃣ CellularAutomaton Class

The CellularAutomaton class represents individual entities in the grid. Each automaton has:

  • Attributes: ID, strength, age, position.
  • Behavior: Updates itself by aging, reproducing, or dying based on simple rules.
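
A stripped-down sketch of such an automaton is shown below. The field names follow the description above, but the update rules (decay rate, reproduction chance) are simplified placeholders, not the full simulation.

```python
import random
from dataclasses import dataclass

@dataclass
class CellularAutomaton:
    id: int
    strength: float
    age: int = 0
    position: tuple[int, int] = (0, 0)

    def step(self) -> list["CellularAutomaton"]:
        """Age the automaton; it may die, survive, or survive and reproduce."""
        self.age += 1
        self.strength *= 0.99                      # strength decays slowly with age
        if self.strength < 0.1:
            return []                              # dead: the grid drops it
        if random.random() < 0.05:                 # small chance to reproduce each step
            x, y = self.position
            child = CellularAutomaton(self.id, self.strength * 0.5, 0, (x + 1, y))
            return [self, child]
        return [self]
```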

2️⃣ Grid Class

The Grid manages a collection of automata. It:

Read more →

More Machine Learning Questions and Answers with Python examples

44. What does it mean to Fit a Model?

Answer
Fitting a model refers to the process of adjusting the model’s internal parameters to best match the given training data. It’s like tailoring a suit – you adjust the fabric and stitching to make it fit the wearer perfectly.
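
A two-line scikit-learn example makes this concrete: fit() is the tailoring step that chooses the parameters (here, a slope and an intercept) from the training data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Data generated from y = 3x + 2 with a little noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X[:, 0] + 2 + rng.normal(0, 0.5, size=100)

model = LinearRegression()
model.fit(X, y)  # "fitting": adjust slope and intercept to best match the training data

print("learned slope:", model.coef_[0], "learned intercept:", model.intercept_)
```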

Key Terms:

  1. Model: A mathematical representation that captures patterns in data. Examples include linear regression, decision trees, neural networks, etc.

  2. Parameters: These are the internal variables within the model that determine its behavior. For instance:

    Read more →

Cellular Automata: Traffic Flow Simulation using the Nagel-Schreckenberg Model

Summary

The Nagel-Schreckenberg (NaSch) model is a traffic flow model that uses cellular automata to simulate and predict traffic on roads.


Design of the Nagel-Schreckenberg Model

  1. Discrete Space and Time:

    • The road is divided into cells, each representing a fixed length (e.g., a few meters).
    • Time advances in discrete steps.
  2. Vehicle Representation:

    • Each cell is either empty or occupied by a single vehicle.
    • Each vehicle has a velocity (an integer) which determines how many cells it moves in a single time step.
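
For readers who want to jump straight to code, here is a compact sketch of the standard NaSch update (acceleration, braking, random slowdown, movement) on a circular road; the road length, vehicle count, and parameters are illustrative.

```python
import random

def nasch_step(road, v_max: int = 5, p_slow: float = 0.3):
    """One time step of the Nagel-Schreckenberg model.

    road[i] is None for an empty cell, or the velocity of the vehicle occupying cell i.
    The road is circular (periodic boundary conditions).
    """
    n = len(road)
    new_road = [None] * n
    for i, v in enumerate(road):
        if v is None:
            continue
        # Distance (number of empty cells) to the next vehicle ahead.
        gap = 1
        while gap <= n and road[(i + gap) % n] is None:
            gap += 1
        gap -= 1
        v = min(v + 1, v_max)          # 1. acceleration
        v = min(v, gap)                # 2. braking to avoid a collision
        if v > 0 and random.random() < p_slow:
            v -= 1                     # 3. random slowdown
        new_road[(i + v) % n] = v      # 4. movement
    return new_road

# 30-cell circular road with 8 vehicles at random positions.
random.seed(1)
road = [None] * 30
for pos in random.sample(range(30), 8):
    road[pos] = 0
for _ in range(5):
    road = nasch_step(road)
print(["." if v is None else v for v in road])
```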

Rules of the Model:

  • The NaSch model uses local rules to update the state of each vehicle at every time step. These rules are:
  1. Acceleration:

    Read more →

Simulate Gastropod Shell Growth Using Cellular Automata

Summary

I started with the paper A developmentally descriptive method for quantifying shape in gastropod shells and bridged its results to a cellular automata approach.

An example of the shell we are modelling: Shell Shape

Steps

1️⃣ Identify the Key Biological Features

The paper outlines the logarithmic helicospiral model for shell growth, where:

  • The shell grows outward and upward in a spiral shape.
  • Parameters like width growth (\(g_w\)), height growth (\(g_h\)), and aperture shape dictate the final form.

These features describe how the shell expands over time in a predictable geometric pattern.
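
To make the geometry tangible before introducing the cellular automaton, here is a small NumPy sketch of a logarithmic helicospiral parameterized by width and height growth rates; the constants are illustrative, not fitted to a real shell.

```python
import numpy as np

def helicospiral(g_w: float = 1.06, g_h: float = 1.04, turns: float = 8.0, steps: int = 2000):
    """Generate (x, y, z) points on a logarithmic helicospiral.

    The radius grows by a factor of g_w and the height by g_h per radian,
    so the curve expands outward and upward as it winds, like a shell aperture path.
    """
    theta = np.linspace(0.0, 2.0 * np.pi * turns, steps)
    r = g_w ** theta                 # width (radial) growth
    z = g_h ** theta                 # height growth along the coiling axis
    x, y = r * np.cos(theta), r * np.sin(theta)
    return x, y, z

x, y, z = helicospiral()
print(f"final radius: {np.hypot(x[-1], y[-1]):.1f}, final height: {z[-1]:.1f}")
```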

Read more →

Cellular Automata: Introduction

Summary

This page is the first in a series of posts about Cellular Automata.

I believe that we could get the first evidence of AI through cellular automata.

A recent paper, Intelligence at the Edge of Chaos, found that LLMs trained on more complex data generate better results. That makes sense in a human context: the harder the material I study, the smarter I get. We need to find out why this is also the case with machines. The conjecture of the paper is that creating intelligence may require only exposure to complexity.

Read more →

Rag: Retrieval-Augmented Generation

Summary

Retrieval-Augmented Generation (RAG) is a powerful technique that enhances large language models (LLMs) by allowing them to use external knowledge sources.

An Artificial Intelligence (AI) system consists of components working together to apply knowledge learned from data. Some common components of those systems are:

  • Large Language Model (LLM): Typically the core component of the system (often there is more than one). These are large models that have been trained on massive amounts of data and can make intelligent predictions based on their training.

    Read more →

CAG: Cache-Augmented Generation

Summary

Retrieval-Augmented Generation (RAG) has become the dominant approach for integrating external knowledge into LLMs, helping models access information beyond their training data. However, RAG comes with limitations, such as retrieval latency, document selection errors, and system complexity. Cache-Augmented Generation (CAG) presents an alternative that improves performance but does not fully address the core challenge of small context windows.

RAG has some drawbacks:
  • There can be significant retrieval latency as it searches for and organizes the correct data.
  • There can be errors in the documents it selects for a query, for example choosing or prioritizing the wrong document.
  • It may introduce security and data issues.
  • It adds complexity: an external application to manage the data (a vector database) and a process to continually update that data when it goes stale.

Read more →

Agents: A tutorial on building agents in python

LLM Agents

Agents are used to enhance and extend the functionality of LLMs.

In this tutorial, we’ll explore what LLM agents are, how they work, and how to implement them in Python.

What Are LLM Agents?

An agent is an autonomous process that may use the LLM and other tools multiple times to achieve a goal. The LLM output often controls the workflow of the agent(s).

What is the difference between Agents and LLMs or AI?

Agents are processes that may use LLMs and other agents to achieve a task. Agents act as orchestrators or facilitators, combining various tools and logic, whereas LLMs are the underlying generative engines.
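
The simplest way to see "the LLM output controls the workflow" is a loop like the toy one below, where the model's reply decides which tool runs next. The `call_llm` placeholder and the `TOOL:`/`FINAL:` reply convention are assumptions for illustration, not a particular framework's protocol.

```python
# A toy agent loop: the LLM's reply decides which tool to run next.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your preferred LLM client here")

TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # demo only: never eval untrusted input
    "echo": lambda text: text,
}

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = f"Goal: {goal}\nReply with 'TOOL: <name>: <input>' or 'FINAL: <answer>'."
    for _ in range(max_steps):
        reply = call_llm(history)
        if reply.startswith("FINAL:"):
            return reply.removeprefix("FINAL:").strip()
        if reply.startswith("TOOL:"):
            _, name, tool_input = (part.strip() for part in reply.split(":", 2))
            observation = TOOLS.get(name, lambda _: "unknown tool")(tool_input)
            history += f"\n{reply}\nObservation: {observation}"
    return "Gave up after too many steps."
```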

Read more →

Courses: Free course on Agentic AI

Some FREE AI courses on Agents I recommend doing

Agents were an important development in machine learning last year.

These are some courses I have done and recommend; they are all free.
A good reason to take courses and watch YouTube videos is that you will learn about current applications of AI and may get ideas for new ones.

AI Agentic Design Patterns with AutoGen

AI Agentic Design Patterns with AutoGen

Topics: Agents, Microsoft, Autogen.

Read more →

Ollama: The local LLM solution

Using Ollama


Introduction

Ollama is the best platform for running, managing, and interacting with Large Language Models (LLMs) locally. For Python programmers, Ollama offers seamless integration and robust features for querying, manipulating, and deploying LLMs. In this post I will explore how Python developers can leverage Ollama for powerful and efficient AI-based workflows.


1️⃣ What is Ollama?

Ollama is a tool designed to enable local hosting and interaction with LLMs. Unlike cloud-based APIs, Ollama prioritizes privacy and speed by running models directly on your machine. Key benefits include:

Read more →

Hugo: A Static Site Generator

Hugo: A Static Site Generator

In this post I give an introduction to what I think is the best static site generator: Hugo.

What is Hugo?

Hugo is an open-source static site generator written in Go. It takes structured content, often written in Markdown, and compiles it into static HTML, CSS, and JavaScript files.


Setting Up Hugo: A Quickstart Guide

Follow these steps to set up your first Hugo site

1️⃣ Install Hugo

First, ensure you have Hugo installed. Use your package manager of choice:

Read more →

ChromaDB: The Lightweight Open-Source Vector Database for AI Applications

1️⃣ Introduction

In the era of AI-powered search, retrieval-augmented generation (RAG), and recommendation systems, efficient vector search is a necessity. While many vector databases exist, most require heavy infrastructure.

Enter ChromaDB: a lightweight, open-source vector database optimized for rapid prototyping and local AI applications.

2️⃣ What is ChromaDB?

Definition:

ChromaDB is a vector database for storing and querying embeddings. It provides an easy-to-use interface for AI developers to integrate similarity search into their applications.
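
Getting started takes a handful of lines; by default ChromaDB embeds the documents for you with a built-in embedding function, and the texts below are just examples.

```python
import chromadb

client = chromadb.Client()                 # in-memory instance; use PersistentClient(path=...) to keep data
collection = client.create_collection(name="docs")

collection.add(
    ids=["1", "2", "3"],
    documents=[
        "PostgreSQL with pgvector stores embeddings in a relational database.",
        "FAISS is a library for fast similarity search over dense vectors.",
        "Hugo is a static site generator written in Go.",
    ],
)

results = collection.query(query_texts=["vector database for embeddings"], n_results=2)
print(results["documents"][0])             # the two most similar documents
```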

Read more →

Project 8: acdb a super fast database for Android

An Android implementation of the [CDB](https://cr.yp.to/cdb.html) database. In some simple testing I am seeing a five to ten times speed increase over SQLite.

Read more →

Project 7: FX-Trader

A derivatives trading system. This is my attempt at building one. In this first post I outline the goals of the project and some of the early design decisions.

Read more →

Project 6: Validator

A validation tool for Excel files. Sometimes you need to export data from one system for loading into another; for instance, you may export a report from a derivatives trading system to feed a collateral management system.

This Excel macro file validates input files to make sure that they are in a specific format. The input file can be in any format that Excel can load. The workbook will load the file and check for errors in place.

Read more →

Project 5: Dictator

Google has just released its speech API. One really cool feature is the ability to transcribe voice in real time. Two years ago I built an app with this idea in mind; at the time I could not make it work. Now it is time to resurrect that app. This post will cover the recording section of the application.

Read more →

Project 4: Meth

This is another simple Android application. It does one thing: it keeps your phone awake.

Read more →

Project 3: Site Shot

This is a really simple Android application. It does one thing: it takes a picture of a website and allows you to share the photo.

Read more →

A script to generate android images

This is a script I use in my Android projects to generate different-sized images.

You can find the script here process.vbs

I build all my images in Inkscape, which is a brilliant application.

You can find some brilliant tutorials here: heathenx.

The script uses Inkscape to convert the SVG files to PNG images of different sizes.

It also uses ImageMagick to format the PNGs nicely.

Finally, it compresses the result using two PNG-crushing programs.

Read more →

A simple android log class

This is a very simple log class I reuse in my projects.

It is a hybrid of Timber by Jake Wharton and the Log class in Android Universal Image Loader by Sergey Tarasevich.

Read more →

Everything

I think this tool is the best search tool for Windows.

Everything

Read more →

An android SharedPreferences wrapper class

This is a wrapper around the Android SharedPreferences class.

It adds a few useful extensions.

Read more →

Tools I use for this Blog

In this post I am going to share some of the tools I currently use to build the blog.

Read more →

Project 2: File Explorer for android

In this project I am building a file explorer library for Android. As I was working on Catcher it became obvious I would need a file picker and explorer solution, so I did a bit of looking on the web. I found three interesting projects that nearly did what I wanted, and I put a few of them together to come up with a hybrid solution.

Read more →

Project 1: Catcher

This is an Android application to transfer files from your phone to somewhere else. It will be built as a PC solution but can be used for a server solution as well.

Read more →

Table of Contents

  1. Introduction to LLM Agents
  2. Methodologies and Core Patterns
  3. Construction - Building the Agent
  4. Collaboration - Multi-Agent Systems and Interaction
  5. Introspection, Memory, and Interpretability
  6. Applications in the Real World
  7. Agents That Enhance AI Itself
  8. Advanced Architectures and Coordination
  9. Challenges, Anti-Patterns, and the Future of Agent Design
  10. Conclusion and Final Thoughts

Chapter 1: Introduction to LLM Agents

What is an LLM Agent

An LLM agent is a software system built around a large language model (LLM) that can autonomously perform tasks by combining language generation with reasoning, memory, and external tools. Unlike traditional LLMs that simply respond to prompts, LLM agents maintain context, plan their actions, and interact dynamically with their environment. This allows them to handle more complex tasks and workflows independently.

Read more →

AI Is the Interface: The Future of Human-Technology Interaction

Technology is the bridge that transforms data into knowledge.

In the coming years, artificial intelligence will evolve from being a tool that assists humans to becoming the primary interface through which we interact with technology and process information. The future of human-computer interaction will not be through keyboards, touchscreens, or even direct programming—it will be mediated by AI systems that understand, interpret, and execute our intentions seamlessly.

Read more →