Everything AI – TOC

This blog has plenty of posts about AI: some are about AI tools, others are about installing AI locally. This post is where I am putting all the AI stuff I have ever blogged about in one place !

The Local AI section is about creating your own AI server using freely available sources, the API section lists the services that provide an API that can be used remotely (most can not be installed locally anyway), and the Online Services section is where you can get things done via AI in your browser (whether they provide an API or not, and whether they can be installed locally or not, is a different story).


Failing spark plugs

My 2014 Toyota Prius uses iridium-tipped spark plugs such as the Denso SK16R11 or the NGK IFR5A11. Both usually set you back around $20, but if you buy them from the official Toyota dealer, they will set you back $150 ! I have no choice though, as the market is flooded with counterfeit spark plugs, and changing the spark plugs on the Prius requires the removal of the wiper area !

What about those $150 spark plugs (I will check the exact price and update this soon) ? Well, I took them out after less than 5000 km and found two of them broken ! Specifically, the insulator nose (shield) around the copper core !

I had a failing head gasket, and the removal was part of fixing that, so I thought that oil or water seeping into the cylinder might have been the problem. But since I kept the plugs in order after removing them, it turns out that the one with the inner shield fully broken was in a cylinder that did not suffer any head gasket leak !

Prius 2014 head gasket replacement

Seems the Prius is known to blow head gaskets. The reason is usually attributed to the hybrid system: the engine switches off and on so often that it is heating up and cooling down all the time.

I am using the official TIS repair manuals. I am warning you, this is a lengthy job, it is not something you can finish in a day.

Tools

There are plenty of tools that you will need along the way, everything from the tools needed to take out the EGR to a triple square bit to remove the cylinder head bolts. In any case, here is a list of the tools I recall needing

  • A ratchet, breaker bar, and a cheater bar
  • A torque wrench (I use ACDelco’s ARM302-4s)
  • Oil filter removal tool
  • 10mm bi-hexagon wrench (An M12 (12mm triple square) bit will do if you fail to obtain the bi-hexagon ! but you will feel a bit of wiggle in it)
  • A pair of pliers
  • 12mm socket (Many screws)
  • 14mm socket (Many screws)
  • 19mm socket (The harmonic damper / crank pulley)
  • Feeler gauges down to 0.004 (0.004 is the maximum acceptable engine warp)
  • A super straight bar to assess the engine against (Laser cut)
  • Harmonic Damper removal tool such as Schley 64300 (I made my own simple tool that works)
  • A tool to take out the valve stem seals (I made my own)
  • Patience

The Harmonic Damper removal tool

There are many ways to remove the pulley (Harmonic Damper)

  1. The most common is using an impact wrench. It just works, but my gut feeling tells me it might not be wise; if it was, Toyota would have approved it.
  2. Another popular way is to tie a belt or strap to a strong part of the vehicle and around the pulley, using friction to hold the pulley in place. The problem with this method is that if the strap is not 100% horizontally aligned and the pulley is being pulled in or out, it might cause damage.
  3. Then there is the Toyota way, which is based on using a tool to hold the pulley in place while you loosen the bolt. An example of such a tool is the Schley Products 64300. I don’t have access to such a tool, so I improvised one that holds the pulley against the engine mount.

Parts

Distilled water

7 liters of distilled water. You are probably going to need to flush the engine with distilled water before adding engine coolant, to get rid of the contaminated coolant in the car. It was contaminated because your head gasket broke and exhaust went through that coolant.

Engine Coolant

(Applies to both Fix and repair)
If the coolant has been contaminated with engine exhaust, you will need to flush and add distilled water, then after a day (a day of light use, and zero days of the car sitting), remove as much of the water as you can and add the SLLC coolant (Do not run on distilled water for more than 1 day).

Toyota’s coolant is super expensive. If you are looking for something other than the OEM, or want to understand the risks and advantages, check out my Toyota’s coolants post

The car takes 6.5 liters of coolant, every container is 2 liters !

Oil and oil filter

(Applies to both Fix and repair)

If the engine oil has been contaminated (and it has), you will need a new filter and new oil

Expected cost (35JD)

Gasket kit

PN: 04111-37316

The repair manual states that you will need a new gasket kit, the gasket kit has the following parts.

FIPG 103

PN: 00295-00103

The form-in-place gaskets (FIPG) from Toyota are best, but they are very pricey. I went with an aftermarket German brand (Victor Reinz Reinzosil) that is high quality (used by German brands as OEM !) and have my fingers crossed. The Toyota FIPG costs around $50, while the Victor Reinz Reinzosil (Universal) costs less than 6 dollars

Cylinder head bolts

(PN: 90910‑02164, later superseded by 90910‑A2009)

Those are torque-to-yield (TTY) bolts: when you install them, they are stretched, and they should not be used again. In my case unfortunately, Al markazeyah Toyota does not provide them. I was told today that they did have them at one point, and they cost $5 each (10 of them is $50), but they no longer carry them ! Yes, I know, what the duck, as an exclusive dealer in Jordan, they absolutely should have them.

The Toyota repair manual often tells you to measure the bolts after removal and replace them if they are out of spec, a classic sign of TTY design, even if not spelled out.

The process

Removing the spark plugs

Getting started with a big surprise !

I started by removing the spark plugs, like you normally would, and to my surprise, those brand new spark plugs (Less than 5000 KMs) are broken, and the ones that were broken are not even the ones where the head gasket leaked into ! Here is more about the failing spark plugs

Toyota’s coolants

The following is a result of me checking the right way to flush my engine of contaminated coolant in my 2014 Prius, but it applies equally to modern Toyotas. My coolant was contaminated due to a blown head gasket !

The lowdown: fill it with distilled water and use it for a day in light traffic (don’t let the water sit, you have to fill it and use it for a day; if the weather is hot, you should not stress it, as water alone gives you no boil-over protection), then flush as much as you can of those 6.5 liters of distilled water, and once you have removed 5 or more liters, fill it up with SLLC (not a drop of LLC is allowed)

The coolant from Toyota for modern cars (2004+) is super expensive ($21 for every 2L). My 2014 Prius for example has 10 liters of coolant in both engine and inverter loops, which would set me back around $100.

The coolant suitable for such new cars is called the SLLC (Super long life coolant), it comes premixed 50/50 (With distilled water)
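
To sanity-check the numbers above, here is a small sketch of the container arithmetic. The 2 liter container size, the $21 price, and the 6.5 and 10 liter figures come from this post; the function name is made up for illustration.

```python
import math

def containers_needed(liters_required, container_liters=2.0):
    """Whole containers needed to cover a volume (you cannot buy a fraction)."""
    return math.ceil(liters_required / container_liters)

PRICE_PER_CONTAINER_USD = 21  # price quoted in this post for one 2L container

for liters in (6.5, 10):
    count = containers_needed(liters)
    print(f"{liters} liters: {count} containers, about ${count * PRICE_PER_CONTAINER_USD}")
```

Since the SLLC is premixed, there is no extra water to account for; what you pour in is what you count.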

Can I mix SLLC with the older LLC ?

You certainly can not ! They are chemically incompatible, you are not free to mix and match.

SLLC is compatible with both inverter and aluminum engine cooling systems, LLC is not. They have different chemical formulations and should not mix in any way. Here is what is given by Toyota in the MSDS documents, obviously written in a way that keeps it ambiguous and keeps others from copying the formulation

Long Life (LLC) | Super Long Life (SLLC)
Ethylene Glycol 107-21-1. 87% – 95% | Ethylene Glycol 107-21-1. 45% – 50%
Diethylene Glycol 111-46-6. Less than 5%
Hydrated inorganic acid, organic acid salts. Less than 5% | Hydrated inorganic acid, organic acid salts. Less than 5%
Water. Less than 5% | Water. Less than 45% – 50%
Bittering agent (Trace amounts)
NOTES: | NOTES: This is P-OAT chemistry (OAT = Organic Acid Technology, “P-OAT” = “Phosphated OAT”)

So what alternative coolants can be used ?

The coolant compatible with your aluminum engine is a P-OAT, which in Toyota terms is called the super long life. If you find the brands “AISIN or Zerex”, that is the OEM for the SLLC. As for the LLC, the OEM is probably Castle, as they have a product that comes in an identical container, but that is not definitive proof.

Things you DON’T need to know
LLC (the one you should not use on modern cars) is not premixed, you add 50% distilled water, but this is for older vehicles.

Remotely controlling the S10+

I have been using a Google Pixel 6 Pro for some time now, excellent phone, and it already got the Android 16 update, so thanks to Google for extending our support ! Rumor has it that with this extended support, we will also get Android 17 !

Now, my old Samsung S10+ that I had before the Pixel 6 has a broken screen: the upper side of the screen displays fine but the touch is broken, while the lower 30% of the screen is black but the touch works. It sustained this injury when a bottle of vodka went crashing down on the screen when the phone was fairly new ! The poor thing never stood a chance, and was only used for a few months (if even that)

Anyway, to avoid keeping my phone occupied when I am taking a video, I have been using a Xiaomi Note 4. Not a bad camera, but the S10+ is obviously a better option for this task

So, how can I use it ? Turns out there is open source software called scrcpy that should be able to do both jobs: mirror the phone’s screen on a PC, and send input from the PC to the phone ! scrcpy should be able to connect to the phone through WIFI (if your phone is recent enough) or USB (whatever is ADB compatible)

Installing scrcpy on my PC

NOTE: I use Gnome on a Debian Linux machine, but installing it on Windows should be straightforward !

sudo apt install scrcpy adb -y

This installs both Scrcpy and Android Debug Bridge (ADB), which are needed to communicate with your phone.

NOTE: The above does not work in bookworm as there is no such package (Both previous and next versions do have it), so instead, if you are on bookworm, follow the steps below

Bookworm installation

sudo apt install git ffmpeg libsdl2-dev adb gcc make meson ninja-build pkg-config libavcodec-dev libavformat-dev libavutil-dev libusb-1.0-0-dev libavdevice-dev

sudo apt install openjdk-17-jdk gradle

Now, we need the SDK !
mkdir -p ~/Android && cd ~/Android
wget https://dl.google.com/android/repository/commandlinetools-linux-11076708_latest.zip
unzip commandlinetools-linux-*.zip -d cmdline-tools
cd cmdline-tools
mv cmdline-tools latest
NOTE: You will have the folder cmdline-tools inside the folder cmdline-tools so we renamed the inner to latest

Now open ~/.bashrc in your editor and add the following two lines at the end, then reload it
vi ~/.bashrc
export ANDROID_HOME=$HOME/Android
export PATH=$ANDROID_HOME/cmdline-tools/latest/bin:$ANDROID_HOME/platform-tools:$PATH

source ~/.bashrc

Now, you need to accept all the license agreements

~/Android/cmdline-tools/latest/bin/sdkmanager --licenses

cd ~
mkdir src
cd ~/src

git clone https://github.com/Genymobile/scrcpy

cd scrcpy

meson setup build
ninja -C build
sudo ninja -C build install

Now, I assume you know how to connect your phone to your PC via ADB, if not, there are tutorials all over the internet… You start by enabling developer options on the phone by tapping the build number 7 times, then you enable debugging (either USB or WIFI or both)

Once that is done, you can get a clone of the screen with the following

adb devices

scrcpy -s device1_serial

You don’t need the -s serial part if there is only 1 device

Also, you can run multiple instances
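
If you do run multiple instances, picking the serials by hand gets old quickly. Here is a minimal sketch that reads the output of adb devices and starts one scrcpy per connected phone. It assumes adb and scrcpy are both on your PATH; the function names and the serials in the example are made up for illustration.

```python
import subprocess

def parse_adb_devices(output):
    """Return the serials that `adb devices` lists in the 'device' state."""
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Header and daemon messages never have "device" as their second word,
        # and unauthorized/offline phones are skipped on purpose.
        if len(parts) >= 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials

def scrcpy_command(serial=None):
    """Build the scrcpy command line; -s is only needed with more than one device."""
    return ["scrcpy"] + (["-s", serial] if serial else [])

def mirror_all():
    """Start one scrcpy window per connected device."""
    result = subprocess.run(["adb", "devices"], capture_output=True, text=True, check=True)
    return [subprocess.Popen(scrcpy_command(s)) for s in parse_adb_devices(result.stdout)]

# Usage: call mirror_all() once the phones show up in `adb devices`.
```

A phone connected over WIFI shows up with an address and port as its serial, and that string is passed to -s the same way.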

Installing Whisper AI on an entry level GPU

Obviously, as you can see from my blog, I have a bunch of high end GPUs for my AI work. The GPU I use on my daily driver PC on the other hand is a complete joke (Nvidia GTX 1650) with 4GB of VRAM… Not exactly a GPU you would use for anything remotely demanding

But running whisper on my local machine is very convenient, the audio files are already there, no need to login to any remote machines and the like, so I will be installing a small version of whisper here, and let us see how this ancient GPU does

1- I already have Python 3.12.7 installed, if you don’t, then “sudo apt install python3 python3-pip python3-venv”. Then create a virtual environment

python3 -m venv whisper-env

And activate it

source whisper-env/bin/activate

Now, before you proceed, if you want your “HuggingFace” directory (where the models actually live) on a different drive or something, you should start by adding the following line to ~/.bashrc or whatever your system uses. Also remember to either run (source ~/.bashrc) or to close and open your terminal again for the changes to take effect

export HF_HOME=/mnt/bigdrive/huggingface

Now, let us go ahead and install faster-whisper

pip install faster-whisper

Also, make sure PyTorch with GPU support is available:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Test GPU availability:

python3 -c "import torch; print(torch.cuda.is_available())"

Now, I thought “tiny” would be the correct size that suits my GPU, but it turned out “base” works just fine !

faster-whisper sample.wav --model-size base --compute-type float16
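
faster-whisper is first and foremost a Python library, so if the command above is not available on your install, a few lines of Python do the same job. This is a minimal sketch using the library’s WhisperModel class; the file name sample.wav and the helper names are just placeholders.

```python
def format_timestamp(seconds):
    """Format seconds as MM:SS.ss for printing next to each segment."""
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes):02d}:{secs:05.2f}"

def transcribe(path, model_size="base", compute_type="float16"):
    """Transcribe one audio file on the GPU and print each segment."""
    # Imported here so the helper above works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cuda", compute_type=compute_type)
    segments, info = model.transcribe(path)
    print(f"Detected language: {info.language}")
    for segment in segments:  # a generator, the actual work happens as you iterate
        start = format_timestamp(segment.start)
        end = format_timestamp(segment.end)
        print(f"[{start} -> {end}] {segment.text}")

# Usage (assumes sample.wav sits next to the script):
# transcribe("sample.wav")
```

If float16 gives you trouble on an older card, passing a different compute_type (for example "int8") is the first thing I would try.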

How do we know if we are hitting the GPU limits ?

watch -n 1 nvidia-smi
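
If you would rather log the memory usage than stare at the screen, nvidia-smi can print just the numbers. A small sketch, assuming the NVIDIA driver (and with it nvidia-smi) is installed; the function names are my own.

```python
import subprocess

QUERY = ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"]

def parse_memory(csv_line):
    """Turn a line such as '1234, 4096' (MiB used, MiB total) into two ints."""
    used, total = (int(field.strip()) for field in csv_line.split(","))
    return used, total

def gpu_memory_usage():
    """Ask nvidia-smi for the memory usage of the first GPU."""
    output = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_memory(output.splitlines()[0])

# Usage:
# used, total = gpu_memory_usage()
# print(f"{used} / {total} MiB ({100 * used / total:.0f}%)")
```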

BeautifulSoup

BeautifulSoup is a Python package that allows you to extract data from HTML files, it is very easy and intuitive

Let us assume you have an HTML page !

First, let us assume you want the title from that HTML page….

from bs4 import BeautifulSoup
# response is assumed to be the result of something like requests.get(url)
mysoup = BeautifulSoup(response.content, 'html.parser')
title = mysoup.title.string if mysoup.title else "No title found"

Now, assuming you want to remove everything that has to do with scripts, CSS and presentation, you can remove the following things with this easy code snippet, then put whatever is left in a variable called text

for irrelevant in mysoup.body(["script", "style", "img", "input"]):
    irrelevant.decompose()
text = mysoup.body.get_text(separator="\n", strip=True)
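
To try the two steps without fetching anything, here is a self-contained sketch that runs them on a made-up HTML string instead of a real HTTP response. It only assumes the beautifulsoup4 package is installed.

```python
from bs4 import BeautifulSoup

# A made-up page, used here instead of a real HTTP response.
html = """
<html><head><title>Demo page</title></head>
<body><h1>Hello</h1><script>console.log("hi");</script>
<style>p { color: red; }</style>
<p>Some useful text.</p><img src="x.png"><input type="text"></body></html>
"""

soup = BeautifulSoup(html, "html.parser")
title = soup.title.string if soup.title else "No title found"

# Calling a tag with a list of names is shorthand for find_all() on that tag.
for irrelevant in soup.body(["script", "style", "img", "input"]):
    irrelevant.decompose()

text = soup.body.get_text(separator="\n", strip=True)
print(title)
print(text)
```

Note that decompose() removes the tags from the tree for good, so grab anything you still need from them before the loop.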

Gnome calculator freezes instantly

Well, the calculator that ships with Gnome would launch instantly, then freeze instantly, I would then have to keep clicking the (x) in the corner and wait for the OS to suggest killing the app

No matter how long I wait on the app, it won’t budge, you can’t even move it around the screen

Turns out the calculator works if I am not connected to the internet (Pull the Ethernet cable out)

It seems the calculator freezes waiting for data from the internet, and when there is no internet, it skips that step

The solution was simple: it goes online for currency conversion rates, something I never use, so all it takes is the following command to stop it from doing that

dconf write /org/gnome/calculator/refresh-interval 0

Yup, that is it, the calculator works perfectly now

All about hard drive cache

How does a hard drive cache work EXACTLY

The short answer is, EXACTLY, no one knows, how a hard drive cache works is a manufacturer secret and differs from drive to drive depending on the drive’s purpose, BUT, we have a lot of clues, some through the SATA specification (And PATA), others through industry standard commands, and it is also not so hard to get what we want from black box reverse engineering, we might not get the actual algorithm (or variant of the algorithm) from such an endeavor, but we can know enough to predict how it will work

Hard drives are not simple machines in any sense of the word, as soon as you are familiar with them, and if you are familiar with computer science, specifically algorithms, you will come to conclusions concerning where complexities lie ! and it is not all in the hardware, much of it is in the hard drive’s software (Firmware)

The hard drive’s raison d’être

You see, a hard drive spins at a certain speed (most commonly 5400 or 7200 rpm), some spin even faster. The hard drive has to do all it can to do what it is asked in the most efficient way, so for example, it allows the OS (through the controller’s driver) to tell it in advance about all the data it wants, so that it can plan the head’s shortest path to getting all that data (Native Command Queuing, and before it Tagged Command Queuing). But let us not get carried away here, we are here to find out how cache works ! NCQ is a topic for a different day (Or is it)

I’m here for the recipes

There are very few recipes and interactions that you are able to make use of, but let me try to come up with the most common ones you will probably want.

IMPORTANT: please note that all this is lost when you switch your computer off, to make this stuff permanent, you will need to add them to /etc/rc.local or use udev rules

Write caching

First, here are the commands to probe for state, enable and disable

# Check status (=0 means disabled)
sudo hdparm -W /dev/sdX
# Enable
sudo hdparm -W1 /dev/sdX
# Disable
sudo hdparm -W0 /dev/sdX

Read ahead caching

First, here are the commands to probe for state, enable and disable

# Check the kernel read-ahead for this device (Zero means disabled, other values are sectors to read ahead)
sudo hdparm -a /dev/sdX
# Enable (Ask for a 256 sector read ahead)
sudo hdparm -a 256 /dev/sdX
# Disable the drive's own read-lookahead feature (Note the capital A, a value of 1 turns it back on)
sudo hdparm -A 0 /dev/sdX

Operating system level caching for a device

# Set read ahead for a disk into RAM (Unit: 512-byte sectors, needs root)
blockdev --setra xxx /dev/sda
# Set write caching in system memory (Percentage of RAM, needs root)
echo 10 > /proc/sys/vm/dirty_ratio
# Fstab entry to create a RAM-backed filesystem (tmpfs), sized as a percentage of RAM or a fixed size (Ex: 20G)
tmpfs /mnt/tmpfs tmpfs size=50%,rw,nosuid,nodev 0 0
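
The units here are easy to mix up: blockdev --setra counts in 512-byte sectors, while the kernel reports the same setting in KiB under /sys. Here is a small sketch that converts between the two and reads the current values; the device name sda is just an example, and the paths are the standard Linux sysfs and procfs locations.

```python
from pathlib import Path

SECTOR_BYTES = 512  # blockdev --setra counts in 512-byte sectors

def sectors_to_kib(sectors):
    """Convert a read-ahead value in 512-byte sectors to KiB."""
    return sectors * SECTOR_BYTES // 1024

def current_settings(device="sda"):
    """Read the kernel's current values for one disk (Linux only)."""
    read_ahead_kb = int(Path(f"/sys/block/{device}/queue/read_ahead_kb").read_text())
    dirty_ratio = int(Path("/proc/sys/vm/dirty_ratio").read_text())
    return {"read_ahead_kb": read_ahead_kb, "dirty_ratio_percent": dirty_ratio}

print(sectors_to_kib(256))  # a 256 sector read ahead is 128 KiB

# Usage on a Linux box with a disk called sda:
# print(current_settings("sda"))
```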

In this day and age, do we still need spinning hard drives anyway ?

Well, yes, and no, in my case, I burn through hard drives and SSDs very quickly, but with a little tweaking, hard drives live a bit longer (Can only be achieved by also managing the vibration of multiple disks with a heavy computer case, but that is a topic for a different post), my use case is all about continuous writing, SSDs don’t seem to like this.

If this does not apply to you, and SSD cost is what is stopping you from going all in SSD, then maybe you would be interested in a post about adding an SSD caching layer in front of your inexpensive spinning disk

Why this is important to me (and you)

It is important to me because I have a MySQL database spread across a big bunch of spinning disks, those disks are being written to ALL THE TIME, and this is precisely why using SSDs here is a bad idea, the data is short lived but the drive is hammered with writes continuously !

I am not saying that hard drives don’t take a considerable hit when they are hammered with writes continuously, but a disk constantly busy seeking while writing vs a disk writing sequentially do not bear the same kind of penalty, in fact, from my experiments, a hard disk with a write load designed to destroy it, will last much less than an SSD ! and the hit on SSDs also depends on the workload (Check write amplification), so yeah, this subject can get out of hand quickly

Is a hard drive’s cache used for reading or writing

Both, you will be told online (On some very authoritative popular places) that it is mostly for reading, but I fail to see what that means, it is mostly for whatever you are doing more ! Here is a bad example, It’s as if you are asking if a dolly is more concerned with sending goods to the truck or bringing them from the truck to the warehouse, it depends on whether you are loading or unloading the truck

Why is this a bad example you ask ? Well, because a hard drive is not a dolly that is being used to unload a truck. Operating systems and database engines and hard drives are not a sheet of metal on 4 wheels (more like a sheet of oxidized metal on one bearing, but that is beside the point). A database operation will typically require many reads before it does any writes, and those reads are also handled by the database engine’s cache and the operating system’s cache, you get the idea and the complexity…. But this still doesn’t mean that cache is concerned with reads more than writes or the other way around. It will depend on your workload, and on the correct disk firmware for that workload (EX: WD Purple vs WD Blue vs WD Black).

The firmware will always determine the priorities of the disk when caching, so certain firmwares will lean towards caching writes over reads while other firmwares will do the opposite.

NCQ already !

Well, since me and my big mouth already got us into NCQ, let me start with that and get it out of the way

NCQ is not possible without a cache, the cache is used to

  • Store operating system’s requests, reordering them according to their locations on the disk, and fetch them
  • Some requests may be served immediately from the cache before that cache is overwritten
  • Write Coalescing and Deferred Writes, writes can be “acknowledged” before being written and wait their turn to truly be written, and are only written to disk when they are combined into a larger write for optimization (There is a feature in NCQ that allows the OS to know if it was written to the disk or just the cache, but you don’t need that in your applications, you shouldn’t care)
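
To make the reordering idea concrete, here is a toy sketch in Python. It is not how any real firmware does it (that part is a manufacturer secret); it assumes the distance between two addresses stands in for seek cost and ignores rotational position entirely. All names and numbers are made up.

```python
def reorder_requests(head_position, pending_lbas):
    """Greedy nearest-first ordering of the queued requests."""
    remaining = list(pending_lbas)
    order = []
    position = head_position
    while remaining:
        nearest = min(remaining, key=lambda lba: abs(lba - position))
        remaining.remove(nearest)
        order.append(nearest)
        position = nearest
    return order

def total_travel(head_position, lbas):
    """Sum of head movement if requests are served in the given order."""
    travel, position = 0, head_position
    for lba in lbas:
        travel += abs(lba - position)
        position = lba
    return travel

queue = [9000, 120, 8800, 150, 4000]  # requests in arrival order
reordered = reorder_requests(100, queue)
print(reordered)
print(total_travel(100, queue), "vs", total_travel(100, reordered))
```

Served in arrival order, the head keeps flying from one end of the disk to the other; served in the reordered sequence, it sweeps across once.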

Okay, so let us get back to what we were saying….

Hard drive cache for reading

Hard drive designers are certainly well aware of the operating system’s cache in RAM, so what good could come from caching in a measly 64MB on the disk ?

This is a very good question. You see, the operating system will not attempt to read neighboring areas of the disk just because they have zero overhead, but the disk will. It is a free potential prefetch, so why wouldn’t it fill its cache with it ?

There are many reasons why it would and why it would not. The cache size is limited, so there are priorities to what gets done with this cache, but also, the required processing is not little, so you don’t want to push that hard drive processor and make a bottleneck out of it. Remember when Western Digital came out with their Black series and promoted them as having 2 processors (micro-controllers is probably the correct term, but why complicate the jargon) ? That is because there are plenty of processing tasks to be done.

So let us get to the reading business. If you ask AI, you will get very outdated or irrelevant data. When I asked AI, it returned advantages that are nullified by operating system disk-to-RAM caching, so let me tell you what is still true and what is not

  1. Prefetching and Read-Ahead Optimization, also known as the read-lookahead feature or read-ahead caching: Since the hard drive has knowledge of its own physical layout and access patterns, it can intelligently prefetch adjacent data into cache. Unlike the operating system, which only caches frequently used files or blocks, the hard drive itself can anticipate sequential reads and load data preemptively at very little to no overhead (because it is mostly reading data that is already in the head’s way). This is particularly useful for sequential (mostly contiguous) reads. The drive itself has the facility to detect whether the read is sequential or not from the request addresses, SO TO AVOID LOST SPINS DON’T COMPLETELY DISABLE IT… MAKE IT LOWER IF YOU MUST, EXPERIMENTATION ON THE BEST SIZE IS KEY
  2. Interaction with OS-Level Caching: While the operating system also caches data in RAM, the drive’s internal cache is the first line of defense against performance bottlenecks. The OS might not always know the drive’s specific access patterns, whereas the drive’s firmware can optimize for known workloads in real-time.
  3. Adaptive Algorithms: Some hard drives (probably all modern ones) employ adaptive caching techniques, where they analyze access patterns over time and adjust caching strategies accordingly. For example, a drive may increase its read-ahead buffer if it detects frequent sequential reads but prioritize different caching strategies when dealing with random access patterns.
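
Here is a toy model of the first point, sequential detection plus read-ahead, just to show why it pays off. It is a sketch under big assumptions: the cache never evicts anything, "sequential" simply means the request directly follows the previous one, and the class name and block numbers are made up. Real firmware is far more elaborate.

```python
class ReadAheadCache:
    """Toy drive cache: prefetches the next blocks once reads look sequential."""

    def __init__(self, read_ahead_blocks=4):
        self.read_ahead_blocks = read_ahead_blocks
        self.cached = set()       # block numbers currently held in cache
        self.last_block = None    # last block the host asked for
        self.platter_reads = 0    # requests that had to wait for the platters

    def read(self, block):
        hit = block in self.cached
        if not hit:
            self.platter_reads += 1
        # Sequential pattern: this request directly follows the previous one.
        if self.last_block is not None and block == self.last_block + 1:
            self.cached.update(range(block + 1, block + 1 + self.read_ahead_blocks))
        self.last_block = block
        return hit

sequential = ReadAheadCache()
hits = [sequential.read(block) for block in range(10, 20)]
print(hits.count(True), "hits,", sequential.platter_reads, "platter reads")

scattered = ReadAheadCache()
for block in (5, 50, 7, 90):
    scattered.read(block)
print(scattered.platter_reads, "platter reads for 4 random requests")
```

Once the pattern is detected, every following request is already waiting in the cache, while the random requests get nothing out of it.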

Hard drive cache for writing

Writing to a hard drive is not as straightforward as it might seem. The cache plays a crucial role in optimizing write performance and improving the overall lifespan of the drive. When data is written to a hard drive, it doesn’t necessarily go straight to the platters. Instead, the cache temporarily holds this data before it is written in an optimized manner.

This is beneficial for a few reasons:

  1. Write Coalescing: The hard drive can combine multiple small write requests into a single, larger, more efficient write operation. This reduces the number of disk rotations required to complete a task.
  2. Reducing Latency: If an application writes small amounts of data frequently, the cache allows the drive to acknowledge the write operation almost instantly before the data is physically committed to the disk.
  3. Deferring Writes: Some writes can be held in cache temporarily, allowing the drive to prioritize more urgent tasks before actually writing the data to disk.
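
Write coalescing is easy to picture with a few lines of Python. This is a toy sketch, not an actual firmware algorithm: it assumes pending writes are described by a start sector and a sector count, and it merges every pair that touches or overlaps. The function name and numbers are made up.

```python
def coalesce_writes(pending):
    """Merge queued (start_sector, sector_count) writes that touch or overlap."""
    merged = []
    for start, count in sorted(pending):
        end = start + count
        if merged and start <= merged[-1][1]:
            # Adjacent to or overlapping the previous run: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(start, end - start) for start, end in merged]

pending = [(100, 8), (500, 8), (108, 8), (116, 4), (508, 8)]
print(coalesce_writes(pending))
```

Five small writes that arrived out of order end up as two larger ones, which is exactly the kind of saving the cache buys you.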

However, this raises an important issue: data integrity. Since data is often held in volatile cache before being written permanently, there is always a risk of data loss in the event of a power failure or unexpected system shutdown. To mitigate this, many enterprise-grade drives implement write-through caching or battery-backed cache systems that ensure data is not lost before it is written.

Does Cache Improve Write Speed?

Yes, but only under certain conditions. For bursty, short writes, the cache significantly improves performance because the hard drive doesn’t have to immediately seek and rotate to a specific position on the disk. Instead, it temporarily holds the data and commits it at an optimal time. However, for sustained, sequential writes that exceed the cache size, the drive eventually has to flush the cache and write directly to disk, which means the cache offers diminishing returns.

Another critical aspect to consider is firmware tuning. Some manufacturers optimize their firmware for different workloads. Consumer drives often prioritize read-heavy workloads, while enterprise drives optimize caching strategies for sustained writes and improved data integrity.

Cache Eviction and Management

Since cache size is limited (typically between 8MB and 256MB on modern drives), the firmware must decide what stays in cache and what gets discarded. The general approach follows:

  • Least Recently Used (LRU): Recently accessed data is kept in cache, while the data that has gone the longest without being touched is replaced.
  • Write Prioritization: If a large sequential write is detected, the drive may flush other cache contents to prioritize this operation.
  • Predictive Read-Ahead: The drive may determine patterns in disk access and prefetch data into cache for anticipated future reads.
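
For reference, this is what plain LRU eviction looks like in a few lines of Python. It is the textbook version built on OrderedDict, shown only to illustrate the idea; whatever variant a given drive really uses is not public.

```python
from collections import OrderedDict

class LRUCache:
    """Least Recently Used cache with a fixed number of slots."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = OrderedDict()  # oldest entry first, most recent last

    def get(self, block):
        if block not in self.slots:
            return None
        self.slots.move_to_end(block)  # mark as most recently used
        return self.slots[block]

    def put(self, block, data):
        self.slots[block] = data
        self.slots.move_to_end(block)
        if len(self.slots) > self.capacity:
            self.slots.popitem(last=False)  # evict the least recently used

cache = LRUCache(capacity=2)
cache.put(1, "a")
cache.put(2, "b")
cache.get(1)        # block 1 is now the most recently used
cache.put(3, "c")   # no room left, so block 2 gets evicted
print(list(cache.slots))
```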

The Role of the OS in Caching

The operating system also plays a major role in caching, with its own layer of RAM-based disk caching. It can reorder and batch disk operations before passing them to the hard drive. This means that even if a hard drive’s cache is relatively small, the OS can compensate by managing frequently accessed data in RAM, which is significantly faster than any onboard hard drive cache.

When Cache Doesn’t Help

While cache is incredibly useful for many workloads, there are scenarios where it does little to nothing:

  • Purely Sequential Writes: If you are writing large files that exceed the cache size, the drive will quickly bypass the cache and write directly to disk.
  • Heavy Random Workloads: If your workload is entirely random writes that do not benefit from coalescing or deferred writes, the cache provides minimal advantage.
  • Database Applications (Like MySQL): Many database engines already perform their own caching and optimizations, sometimes making CERTAIN TYPES OF CACHING in the hard drive’s cache redundant, and making other caching mechanisms more valuable (which is why I research hard drive caching).

Final Thoughts

Hard drive cache is a critical but often misunderstood component. It plays a dynamic role in both read and write operations, helping to bridge the performance gap between slow spinning platters and fast system memory. While the actual caching algorithms remain proprietary, we can infer their behavior from real-world testing and performance characteristics.

For database-heavy workloads like MySQL, tuning both the database and disk caching mechanisms can lead to significant performance gains. Understanding when and how a hard drive’s cache is utilized can help in selecting the right drive for your specific use case.

12TB disk does not show up

I have been using an Intel “D525MW” Atom system as a network attached storage system for some time now. I have an extra SATA PCIe card (Silicon Image, Inc. SiI 3132) so that I can connect 4 disks. When the 12TB Western Digital disk (HGST HUH721212AL) is connected to the external SATA card, it does not show up, meaning, an “fdisk -l” does not bring it up !

So the next thing to do is swap the SATA connection with a different disk connected to the motherboard, and suddenly it works, amazing, but I need to know where the problem comes from

The first theory is that disks that are SFF-8447 compliant (rather than the old IDEMA standard) are not supported by this controller !