The death of the internet

Let me throw a few of concepts we’ve been hearing about more & more lately:

  • metered bandwidth
  • end of net neutrality
  • content censorship
  • protocol restrictions
  • geographic restrictions
  • wiretapping
  • deep packet inspection
  • malware becoming crimeware
  • dataleaks
  • DDoS
  • internet kill switch

The way that we used to see the internet as an unrestricted web of information is changing rapidly. And it looks like the free ride is coming to an end.

Corporations want to dictate our internet usage, politicians don’t understand the issues of a technology from the next generation; and if they do, lobbyist money has a strong convincing power. And quite frankly your average user has no clue either. What was once a free and unrestricted flow of information is quickly becoming a metered and port/site/protocol restricted happy network.

references:

Traffic discrimination & Net Neutrality

Comcast’s P2P throttling suit

What was revolutionary about the internet was its lack of boundaries, the world was connected. Since then the marketing & licensing geniuses have caught on to the fact that it is possible to restrict content by geographic location. Like regions on DVDs you now cannot consume certain media in certain regions. It is a travesty to the human accomplishment that is the internet and inevitably leads to the absurdity that it is easier to consume pirated content than legal one.

Organized crime also has caught on, the obnoxious malware & viruses that were once spreading for fame or installing dumb toolbars are now becoming very targeted at committing crimes. From harvesting financial information to generating DDOS attacks. A black market of stolen information and network hitmen is emerging on an internet that many companies handling your data do not understand. Viruses much like biologic organisms are becoming polymorphic with self defense mechanisms. Their technological advancement clearly shows funded work as opposed to the classic image of the basement hacker we all have ingrained in our heads.

references:

Zeus botnets specialized in harvesting financial data

Researchers hijack control of the Torpig botnet for 10 days and recover 70 GB of stolen data from 180,000 infections

Governments are starting to play their silly international politics game on this new field, releasing cyber attacks against one another. The amount of information & critical infrastructure facing the great network is making it a strategic field of military and intelligence importance. It is clear that the network in its current state of international openness is an issue to government interests, and we can fully expect to find cyber borders erected in the near future, not unlike the great firewall of China even though this last example has other applications. Applications that pertain to opinion control via censoring, China isn’t the only country doing that, Australia is pretty good at it. And the U.S. is working on creating a presidential “interet kill switch”, you know just in case people here get sick enough of 2 everlasting wars and 4th amendment tramplings to take the streets. Egypt has just done it, they shut down internet and cell phone communications during their 2011 protests.

references:

Stuxnet’s specific targeting of Iran’s SCADA controled systems

The Great Firewall of China

Australia’s intenet censorship

Obama’s internet kill switch

How Egypt shut down the internet

At a time when Wikileaks is putting to shame governments and corporations, more controls are inevitable.

So what’s next?

Computers and network devices have become increasingly powerfull. So much so that this blog you’re reading is instantiated on a 8 years old server sitting on a fridge behind a home DSL. Besides computing & networking power, something else has been growing that you might have heard about: social networks.

I think that one day, a couple of geeks will be tired of the state of the internet and will throw a home-made link between their houses to share what they want when they want without getting advertised, wiretapped, datamined or attacked. This can currently be done with long range wireless devices (WiMAX) or even by adding a layer to the current infrastructure (think VPN).  Soon a third geek friend will want in, and provided that he is trusted by the founders, he’ll get in. After a while, adding friends of friends will become too far out of reach for the founders to decide and they will implement a social reputation based system for dealing with users.

And that’s it, you have a social network (at the strictest send of the term) that is growing & correcting itself based on reputation. This will of course be completely decentralized (unlike the internet) which means you will be relaying information for individuals you don’t know, hence the criticality of its reputation element.

This network will eventually be overrun by corporate, mafia & government interests finding ways to abuse the reputation systems, it will slowly die and be replaced by another couple of geeks down the road.

The end.

Markov chains based random word generation

Markov chains are used primarily in Natural Language Processing for part-of-speech tagging. Corpora are studied to establish the construction of sentences. This is a very powerful algorithm that can also be used to generate new material (words, text, et cetera). In this first post I will talk about generating words.

  • How it works

Given a corpus, letter patterns are studied at different depths. For depth one, the probability of a letter following another is established. For depth two the probability of a letter following a sequence of 2 letters is established. The same goes for greater depths. The result of all this studying is a table of probabilities defining the chances that letters follow given sequences of letters.

When the time comes to generate words, this table of probabilities is used. Say that we need to generate a word at depth 2, we seed the word with 2 null letters, then we look in the table for all the letters that can follow a sequence of 2 null letters and their associated probabilities. Their added probabilities will be 1 obviously. We generate a random number between 0 and 1 and use it to pick which following letter will be chosen. Let’s say that the letter “r” was chosen. Our generated word is now comprised of “null” and “r”. We now use this sequence as the basis for our next letter and look for the letters that can follow it. We keep going until an null letter is reached, signifying the end of the generated word.

Here’s a sample of a probability table:

  • Benefits of this algorithm

It will generate words that do not exist but respect the essence of the corpus it’s based on. This is really cool for example to generate words that sound English but aren’t (say for random passwords that can be pronounced/remembered). We could also make a list of all the cool words (motorcycle, sunglasses, racing, et cetera) and extract their essence to generate maybe a product name that is based on coolness :).

Go ahead and play with it: