Category Archives: all out geekery

  • Markov chains based random text generation

    We’ve already seen how to use Markov chains to generate random words that capture the essence of a previously analyzed corpus. The exact same algorithm can be applied to text: the base entities become words instead of letters. I keep punctuation as part of the entities, so sentence flow becomes part of the extracted statistical essence.
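    For the curious, here is a minimal sketch of the word-level version in Python. It is not the code running on this site, just an illustration under a few assumptions: the text is split on whitespace so punctuation stays glued to its word, the names `build_chain` and `generate_text` are made up for the example, and instead of storing explicit probabilities it keeps every observed follower in a list, so a uniform pick from that list lands on each follower in proportion to how often it was seen.

```python
import random
from collections import defaultdict

END = None  # marks the start and the end of the text, like a "null" word


def build_chain(text, depth):
    """Map each sequence of `depth` words to the list of words seen after it."""
    # Splitting on whitespace keeps punctuation attached ("sat." is one entity).
    tokens = [END] * depth + text.split() + [END]
    chain = defaultdict(list)
    for i in range(depth, len(tokens)):
        chain[tuple(tokens[i - depth:i])].append(tokens[i])
    return chain


def generate_text(chain, depth, rng=random, max_tokens=50):
    """Walk the chain from the start marker until END or max_tokens words."""
    tokens = [END] * depth
    while len(tokens) < depth + max_tokens:
        # Repeated followers appear several times, so frequent ones win more often.
        following = rng.choice(chain[tuple(tokens[-depth:])])
        if following is END:
            break
        tokens.append(following)
    return " ".join(tokens[depth:])


corpus = "The cat sat. The dog sat. The cat ran."
chain = build_chain(corpus, depth=1)
print(generate_text(chain, depth=1))
```

    A greater depth makes the output stick closer to the original sentences, while depth 1 wanders more freely.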

    Feel free to send me ideas of cool corpora to analyze.

    You can play with it here:

  • The static experiment – all done!

    The little static box is up & running, Akrin has been fully migrated to it. I absolutely love that there are no moving parts in there. The running temperature of the CPU is what worried me the most since nothing is making the air flow in & out of there. At the heat of heavy processing, the temperature of the CPU doesn’t go above 67 degrees Celsius. That’s pretty all right! Quite frankly this little box handles stress very well but my point of reference is so obsolete I’m bound to be impressed :).

    Pictured below, the new & old Akrin together for a soul transfer

    So there you have it, a kick ass little box that is discreet to the eyes & ears.

  • The static experiment – WTF Habey?

    The hardware showed up! So I get busy installing the RAM and the SSD. Habey in all its generosity included a SATA data cable with its barebone server. This is cool I guess, I mean I already have a bunch and hard disks always have cables but I’ll take it.

    I start hooking up the SSD when I realize that there are no SATA power connectors anywhere.

    Do you see anything?

    The problem is that apparently I’m the only person who ever bought one of these systems. There is literally no information available on any site (including www.habeyusa.com) on how to power your hard drives. Even though it has an IDE slot, there is no 4 pin Molex power available either, so no luck hijacking one of these for the SATA SSD.

    After careful examination of the motherboard, there is one slot that’s labeled “POWOUT1”. It’s a slot whose shape I haven’t seen for ages. I hope you’re sitting as you’re about to read this: it is shaped for 3.5″ floppy disk drive power. And that’s the only power that seems tap-able for hard drives. Much research on the web yields many 4 pin Molex to SATA cable converters, and eventually some floppy power to 4 pin Molex ones. Ultimately I found just the cable I needed.

    You’re reading right; SATA Power 15pin to FDD (as in Floppy Disk Drive) power 4 pin…

    Habey thought to include a standard SATA data cable but not their weird ass power equivalent. And if you look carefully, SATA power cables have 5 wires, while the one in the picture above has only 4. The 3.3 V wire has simply been dropped. Doesn’t this affect functionality?

    Well fuck everything, I’m not waiting 5 more days for a silly cable. Thankfully we have a master hardware tinkerer at work, and after checking the voltages of the slot on the motherboard (to confirm that it was indeed FDD power), we cannibalized a couple of old power supplies to come up with a Frankenstein cable.

    TADAAAAAA!!


    And it works perfectly. Seriously Habey: better labeling, a motherboard manual (online or paper) or a weird ass cable included would have been nice.

    Tomorrow we’ll stress test the box and it’d better take the beating without crashing.

    Thanks to playtool.com for their very helpful resource.

  • The static experiment

    Akrin is a server whose soul has been through many iterations of old hardware. It never needed many resources so I easily got away with $30 PCs bought at the university surplus.

    It currently resides on an aged Pentium IV with just 500MB of RAM and some old IDE hard drive. With the addition of more & more projects (recently: CCTV installation, new sites such as www.blindspotis.com, database intensive Markov chains generation), it’s close to maximum capacity and could use an upgrade.

    More than new hardware, I’ve decided it was time to change how computing is done at home. I’m going for no moving parts. This means no fans, no spinning disks and no moving heads.

    What are the advantages?

    • no vibrations, not an iota of noise
    • no jet take off sound when running heavier computation
    • no malfunctioning fans that could result in a fire hazard
    • supposedly hardware that is more resistant to shocks
    • fanless means less powerful, which in turn means less power consumption

    Here’s what I ordered:

    It doesn’t come with RAM or a hard drive. I like the small form factor and the fact that it has 2 NICs. This means it can easily be recycled into a nice router should the experiment fail.

    • Some RAM (DDR2 SODIMM), I went for the max 2GB that the EPC-6542 will support. ($45) link
    • A 2.5″ SATA II 128GB solid state disk (SSD) ($223 – $75 mail in rebate = $148) link

    Now SSDs are pretty expensive compared to traditional hard drives, so it is a high price to pay for no moving parts. But they are also much faster, and because of the CCTV cams recording 24/7, I think that the I/O speed gain will have a tremendous overall effect on the server.

    Akrin will soon run on $423 of new hardware, this is unprecedented :)

    To be continued…

  • Markov chains based random word generation

    Markov chains are widely used in Natural Language Processing, for example for part-of-speech tagging. Corpora are studied to establish how sentences are constructed. The same technique can also be used to generate new material (words, text, et cetera). In this first post I will talk about generating words.

    • How it works

    Given a corpus, letter patterns are studied at different depths. For depth one, the probability of a letter following another is established. For depth two the probability of a letter following a sequence of 2 letters is established. The same goes for greater depths. The result of all this studying is a table of probabilities defining the chances that letters follow given sequences of letters.
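    To make the studying step concrete, here is a rough Python sketch of how such a table could be built. It is an illustration, not the actual code behind this post: the function name `build_table`, the use of an empty string as the "null letter", and the three-word corpus are all assumptions made for the example.

```python
from collections import Counter, defaultdict

NULL = ""  # assumed stand-in for the "null letter" marking word start and end


def build_table(words, depth):
    """Map each sequence of `depth` letters to {next_letter: probability}."""
    counts = defaultdict(Counter)
    for word in words:
        # Pad with nulls so that word beginnings and endings are learned too.
        letters = [NULL] * depth + list(word) + [NULL]
        for i in range(depth, len(letters)):
            context = tuple(letters[i - depth:i])
            counts[context][letters[i]] += 1
    table = {}
    for context, followers in counts.items():
        total = sum(followers.values())
        # Turn raw counts into probabilities that sum to 1 per context.
        table[context] = {letter: n / total for letter, n in followers.items()}
    return table


# Tiny made-up corpus, only to show the shape of the result.
table = build_table(["cat", "car", "cart"], depth=2)
for context, followers in table.items():
    print(context, followers)
```

    With this toy corpus, the sequence "c", "a" is followed by "r" two times out of three and by "t" one time out of three, which is exactly the kind of entry the table holds.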

    When the time comes to generate words, this table of probabilities is used. Say that we need to generate a word at depth 2: we seed the word with 2 null letters, then look in the table for all the letters that can follow a sequence of 2 null letters, along with their associated probabilities. These probabilities add up to 1. We generate a random number between 0 and 1 and use it to pick which following letter will be chosen. Let’s say that the letter “r” was chosen. The last 2 entries of our generated word are now “null” and “r”. We use this sequence as the basis for our next letter and look for the letters that can follow it. We keep going until a null letter is reached, signifying the end of the generated word.
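    The generation loop described above could look something like the following Python sketch. Everything in it is illustrative: the depth 2 table is hand-written and hypothetical (it does not come from a real corpus), the empty string plays the role of the null letter, and `pick` and `generate_word` are names invented for the example.

```python
import random

NULL = ""  # assumed stand-in for the "null letter"

# Hypothetical hand-written depth 2 table, only here to drive the example.
TABLE = {
    (NULL, NULL): {"r": 0.5, "t": 0.5},
    (NULL, "r"): {"a": 1.0},
    (NULL, "t"): {"a": 1.0},
    ("r", "a"): {"t": 0.5, "n": 0.5},
    ("t", "a"): {"r": 0.5, "n": 0.5},
    ("a", "t"): {NULL: 1.0},
    ("a", "r"): {NULL: 1.0},
    ("a", "n"): {NULL: 1.0},
}


def pick(followers, rng):
    """Pick a letter by accumulating probabilities until they pass a random draw."""
    draw = rng.random()  # uniform number in [0, 1)
    cumulative = 0.0
    for letter, probability in followers.items():
        cumulative += probability
        if draw < cumulative:
            return letter
    return letter  # fallback in case rounding leaves the total just under 1


def generate_word(table, depth, rng=random):
    """Seed with nulls, then keep picking followers until a null is drawn."""
    letters = [NULL] * depth
    while True:
        context = tuple(letters[-depth:])  # the last `depth` letters
        letter = pick(table[context], rng)
        if letter == NULL:
            return "".join(letters[depth:])  # drop the null seed
        letters.append(letter)


print(generate_word(TABLE, depth=2))
```

    With this toy table the only possible outputs are "rat", "ran", "tar" and "tan"; a table extracted from a real corpus gives far more variety.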

    Here’s a sample of a probability table:

    • Benefits of this algorithm

    It will generate words that do not exist but respect the essence of the corpus it’s based on. This is really cool for example to generate words that sound English but aren’t (say for random passwords that can be pronounced/remembered). We could also make a list of all the cool words (motorcycle, sunglasses, racing, et cetera) and extract their essence to generate maybe a product name that is based on coolness :).

    Go ahead and play with it: