Sunday, October 25, 2015

வாழ்க்கை தாரகை (life is a star)



இது பொதிகை மலை, அதில் பனியின் புகை
         (idhu podhigai malai, adhil paniyin pugai)
         (Podhigai Mountains, the mist's smoke)
இந்த  இரவின் பகை, என் கைகள் துணை
         (indha iravin pagai, en kaigal thunai)
         (The night's enmity, my arms shall guard)
பாரெலாம், தேடியும், என் மனம், உன்வசம்  
         (paarelaam, thediyum, en manam, un vasam)
         (World over, I have searched, but my heart's with you)
என் மனம், தாண்டவம், காரணம், உன்னிடம்
         (en manam, thaandavam, kaaranam, unnidam)
         (My heart, dances, the reason, is with you)

என் கண் அசைத்தாலே  உந்தன் ஞாபகம்
         (en kann asaindhaalae, undhan ngyabagam)
         (Even when my eyes move, it's you that I think of)

நீ காமன் கலை, அதை ஓதும் சிலை
        (nee kaaman kalai, adhai odhum silai)
        (You are the art of love [Kaman is the god of love], the sculpture that speaks it)
உன் இதழ்கள் தனை, என் இதழில் இணை
        (un idhalgal thanai, en idhalil innai)
        (Let your lips, join with mine)
மோகமோ, என் வசம், நாணமோ, உன் வசம்
        (mogamo, en vasam, naanamo, un vasam)
        (passion is in me, shyness in you)
தேகமோ, சேரணும், தாகமோ, தீரணும்
        (thegamo, seranum, thhagamo, theeranum)
        (Our Bodies, should meet, thirsts to be quenched)

உன்  கை அணைத்தாலே, உள்ளம் போர்க்களம்
        (un kai anaithaalae, ullam porkalam)
        (If your arms hug, my heart becomes a battlefield)

உன் பார்வை வலை, அது எந்தன் சிறை
        (un paarvai valai, adhu endhan sirai)
         (Your look is a net that becomes my prison)
நீ பேசும் குரல், நான் தூங்கும் இசை
       (nee paesum kural, naan thoongum isai)
       (The voice when you speak, is the music I sleep to)
காலத்தின், போக்கெனும், சோர்விலா, ஓர் விஷம்,
       (kaalathin, pokaenum, sorvila, or visham)
       (Time, its passage, is a tireless, poison)
அதை போக்கவும், தாக்குமே, தேவதை, உன் ரசம்
       (adhai pokkavum, thaakumae, thevathai, un rasam)
       (To eliminate, it fights, goddess,  your essence.)

என் வாய் இசைத்தாலே உந்தன்  வாசகம்
       (en vaai isaithaalae, undhan vaasagam)
       (When my mouth sings, its your story)

உன் இமைகள் கலை, அசைக்கும் பரதக்கலை,
      (un imaigal kalai, asaikkum barathakalai)
      (Your eyelids are moved by the art of Bharatanatyam [South Indian classical dance])
என் திசை சாய்க்குமே,  உன் ஒரு புன்னகை
      (en thisai saaikumae, un oru punnagai)
      (It changes my direction, your one smile)
காந்தமோ, உன் இடை, சாந்தமோ, உன் நடை
      (kaandhamo, un idai, saanthamo, un nadai)
      (magnet, your hips, calm, your steps)
கடல் பாறையோ, கால்களை, தேய மோதுதே, உன் நகை
      (kadal paarayo, kaalkalai, theya modhuthey, un nagai)
      (rock at sea, my feet, to erode they crash, your giggles)

அகம்  அடைந்தாலே, வாழ்க்கை தாரகை
      (agam adainthalae, vaalkai thaarakai)
      (When I win your heart, my life is a star)


Copyright (c) Sarvi Shanmugham

Tuesday, September 1, 2015

Rent Vs Buy Comparison


To Rent or Buy. A question that never seems to have answers, and happens to be at the center of many debates when family and friends get together. And I have had more than my share of that conversation.

A friend of mine, Subbu, started this spreadsheet analyzing the home-buying decision a while back and shared it with me. I had a few more questions and what-ifs that I wanted to understand, so I decided to expand on it. I am sure a lot of you have questions on this topic. This is an attempt to settle this by the numbers.

I found an even better one from the New York Times here; use that.

Well, not really. But it does provide a tool to debate with hard numbers, under different assumptions and projections for the future.

Below, you will find that document published. I have protected most of the formulae so you won't be able to modify them. But the fields marked green on the first page are editable, to allow you to test out different assumptions about the future.

The original interactive Google spreadsheet document can be found here.

The one thing this document does not cover is the emotional aspects of owning your own home, the memories and the ups and downs that might come with it. Attempting that would be an exercise in futility.

The analysis starts with the following assumptions:

  1. You have X amount of money for a down payment, specified in Case 1, Down Payment Amount.
  2. You have a monthly allocation of Y amount, specified in Case 1, Maximum Allocation.
  3. The price of the home you want to buy.
  4. The rent you expect to pay for the house if you were to rent.
From here we look at different scenarios: change in the price of the house, change in rent, change in interest rates, expected appreciation for the house, as well as the expected rate of return on money you might invest outside of the house.
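
To make the trade-off concrete, here is a minimal Python sketch of the same idea. It is not the spreadsheet's formulas, just a simplified illustration: it ignores taxes, maintenance, closing costs and the other factors the spreadsheet models, and every parameter value below is hypothetical.

# A deliberately simplified rent-vs-buy sketch. All numbers are hypothetical;
# it ignores taxes, maintenance, closing costs, PMI, etc.

def buy_vs_rent(price, down_payment, monthly_budget, rent,
                mortgage_rate=0.04, appreciation=0.03,
                investment_return=0.06, rent_inflation=0.03, years=30):
    months = years * 12
    r = mortgage_rate / 12.0
    loan = price - down_payment
    # Standard fixed-rate mortgage payment formula
    payment = loan * r / (1 - (1 + r) ** -months)

    balance = loan
    buyer_investments = 0.0            # buyer invests what's left after the mortgage
    renter_investments = down_payment  # renter invests the down payment instead
    current_rent = rent

    for month in range(months):
        interest = balance * r
        balance -= payment - interest
        buyer_investments = buyer_investments * (1 + investment_return / 12.0) \
            + max(monthly_budget - payment, 0)
        renter_investments = renter_investments * (1 + investment_return / 12.0) \
            + max(monthly_budget - current_rent, 0)
        if (month + 1) % 12 == 0:
            current_rent *= 1 + rent_inflation

    home_value = price * (1 + appreciation) ** years
    buyer_net_worth = home_value - balance + buyer_investments
    return buyer_net_worth, renter_investments

buy_net, rent_net = buy_vs_rent(price=800000, down_payment=160000,
                                monthly_budget=4500, rent=2800)
print("Buy net worth: %.0f  Rent net worth: %.0f" % (buy_net, rent_net))

Playing with the appreciation, investment return and rent inflation numbers in a sketch like this shows the same sensitivity the spreadsheet does: small changes in those assumptions can flip the answer.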

You can always customize the different cases/scenarios from there, for example:
  1. Delaying the buying decision for a few years if you think house prices are too high
  2. Modeling an increase in interest rates X years from now, if you delay buying the house
  3. Renting for the rest of your life
  4. Different rates of appreciation for the house
  5. Different rates of return on your investment money
  6. Different inflation rates for rent and other expenses
The following is just a read-only view of the document. If you want to modify the scenarios and the values, use the link above to the original Google Sheet, make your own copy or download it to Excel, and then you can make modifications and try out different scenarios.

Use the Inputs tab below the graph to try out different scenarios.

Have fun trying it out. If you have any questions or corrections on the formulae I have used, or suggestions to improve it, do let me know.



Copyright (c) Sarvi Shanmugham

Friday, August 28, 2015

My Quantopian Notes

While working on Quantopian.com, I have had to do a fair bit of learning in various areas:

  1. Learn Statistics
  2. Operating on Data in Pandas
  3. Trading Models
This is my attempt at documenting some of my learnings, for my own benefit and that of others.

Fundamentals Data Operations

Calculate Z-Score for specific fundamental fields in the Quantopian fundamentals DataFrame.


from scipy import stats
#Get rows you want
d=fund_df.loc[['pe_ratio','ev_to_ebitda']]
#Transpose it
d=d.T
#Drop NANs
d=d.dropna()
#Apply stats.zscore on that data
d=d.apply(stats.zscore)
#transpose it back
d=d.T
#rename the columns as needed
zscore=d.rename({'ev_to_ebitda':'ev_to_ebitda_zscore','pe_ratio':'pe_ratio_zscore'})

Calculate Z-Score for specific fundamental fields in the Quantopian fundamentals DataFrame, but do it group-wise by Morningstar Sector Code


from scipy import stats
#Transpose it
d=fund_df.T
#Drop NANs
d=d.dropna()
#Group by Morningstar sector code
d=d.groupby('morningstar_sector_code')
#Get rows you want
d=d[['pe_ratio','ev_to_ebitda']]
#Transform with stats.zscore on that grouped data
d=d.transform(stats.zscore)
#transpose it back
d=d.T
#rename the columns as needed
gzscore=d.rename({'ev_to_ebitda':'ev_to_ebitda_gzscore','pe_ratio':'pe_ratio_gzscore'})

Add these rows back to the fundamentals data


import pandas
#Append the z-score rows to the fundamentals DataFrame
fund_df = pandas.concat([fund_df, zscore, gzscore])
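
As a quick sanity check (the row labels below are just the names assumed by the rename calls above), the new z-score rows can then be pulled out like any other fundamental field:

# The z-score rows now behave like any other row of the fundamentals DataFrame;
# securities remain the columns.
fund_df.loc['pe_ratio_zscore'].head()
fund_df.loc[['pe_ratio_gzscore','ev_to_ebitda_gzscore']]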





Copyright (c) Sarvi Shanmugham

Wednesday, July 22, 2015

My EC2 Theano Keras Cluster Development Setup

Setting up my EC2 environment to work on machine learning using GPU acceleration took a bit of learning. Setting up EC2 itself was simple; I had to figure out how to do the following:
  1. Set up an EC2 cluster of nodes
  2. Make sure there is shared storage in EBS where the home directories are stored: storage that will persist and be reused across multiple EC2 cluster starts and stops.
  3. Set up the networking between them so that they can talk to each other and have passwordless SSH between nodes in the cluster
  4. Set them up to use their GPUs, Theano and Keras
  5. Set up the master with a GUI desktop for developer convenience
So I thought I should document my steps for my own future use. Hopefully this will also help others who come looking for a guide, just as I was a few days ago.
This is my development setup. I plan on building machine learning models, running them on the GPU, and eventually running them on a cluster of GPUs. So I am planning ahead to make sure I have all the pieces I need to do that development.

My Local Machine setup

  1. Make sure StarCluster is installed and is configured to use my EC2 account.
  2. That it can be used to create clusters in my region
  3. Create a volume where home directories will be stored and will persist across cluster starts/stops

My Node Setup

  1. Make sure the Ubuntu image being used is up to date and secure.
  2. EC2 GPU enabled StarCluster Ubuntu 14.04 image for cluster development
  3. An EC2 VPC and Security Group to bring the nodes in the cluster together and allow them to be accessible.
  4. Setup passwordless ssh access between nodes in the cluster
  5. Numpy, Scipy and other libraries
  6. Nvidia GPU tooling
  7. Python VirtualEnv
  8. Theano
  9. Keras
  10. EC2 Instance Setup
  11. XFCE Desktop with X2GO

My Master Setup

  1. All the steps from My Node Setup above
  2. A XFCE desktop connected with X2GO for GUI access to the master node.

Install StarCluster on your local machine, macOS in my case

Install StarCluster by following the installation instructions at
http://star.mit.edu/cluster/docs/latest/installation.html
Then follow the quick start steps at
http://star.mit.edu/cluster/docs/latest/quickstart.html
to make sure you can start a basic default cluster using
starcluster start mycluster
starcluster sshmaster mycluster -u ubuntu
starcluster terminate mycluster

EC2 VPC and Security Group Setup

Create your own VPC from the VPC menu, and enable the following:
  1. VPC CIDR: Pick a range. Block sizes run from /16 to /28. Example: 172.30.0.0/16
  2. DNS Resolution: Yes
  3. DNS Hostnames: Yes
  4. Classic Link: Yes
Add a Security Group, and do the following
  1. Give it a name
  2. Add the VPC to the security group
  3. Edit the Inbound Rules with
    1. TCP, ALL TCP, ALL, 0.0.0.0/0
    2. SSH(22), SSH, 22, 0.0.0.0/0
    3. ALL ICMP, ICMP(1), ALL, 0.0.0.0/0

Update Ubuntu 14.04 and confirm the Shellshock bug is fixed

Create an EC2 instance of type g2.2xlarge to start with, using the latest standard Ubuntu AMI and the VPC above.
1. Confirm linux kernel information
uname -mrs
cat /etc/lsb-release  
2. Confirm that the Shellshock bug does not exist in this image; the following command should not print "vulnerable".
env x='() { :;}; echo vulnerable' bash -c "echo this is a test"
3. Upgrade packages
sudo apt-get update
sudo apt-get upgrade
sudo apt-get dist-upgrade
4. Upgrade the kernel as follows. Go to http://kernel.ubuntu.com/~kernel-ppa/mainline/ and pick the latest version of the kernel within the same major version number.
mkdir kernel
cd kernel/
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19.8-vivid/linux-headers-3.19.8-031908_3.19.8-031908.201505110938_all.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19.8-vivid/linux-headers-3.19.8-031908-generic_3.19.8-031908.201505110938_amd64.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19.8-vivid/linux-image-3.19.8-031908-generic_3.19.8-031908.201505110938_amd64.deb
sudo dpkg -i *.deb
cd ..
rm -rf kernel
sudo shutdown -r now
5. Verify again that the Shellshock bug does not exist
env x='() { :;}; echo vulnerable' bash -c "echo this is a test"
6. Create and save the AMI for future use.

Create StarCluster enabled Ubuntu 14.04 AMI

The next step is to create a StarCluster-enabled image based on the updated Ubuntu 14.04 AMI we created in the previous step.
1. Create new EC2 instance with the AMI created above or continue from the last section.
2. Update apt-get sources.list to uncomment the lines that add multiverse as a source and update
sudo vi /etc/apt/sources.list
sudo apt-get update
3. Install nfs-kernel-server and its dependencies along with portmap. Ubuntu 14.04 uses rpcbind, but we can install portmap and make it work.
sudo apt-get install nfs-kernel-server nfs-common portmap
sudo ln -s /etc/init.d/nfs-kernel-server /etc/init.d/nfs
sudo ln -s /lib/init/upstart-job /etc/init.d/portmap
sudo ln -s /lib/init/upstart-job /etc/init.d/portmap-wait
4. Use the customized scimage_14_04.py script from my fork of StarCluster
git clone https://github.com/sarvi/StarCluster.git
sudo python StarCluster/utils/scimage_14_04.py  
5. Download sge6.tar.gz from the following URL into /home/ubuntu/
https://drive.google.com/folderview?id=0BwXqXe5m8cbWflY1UEpnVUpScVozbFVuMERaOE9sMktrX1dFQmhCU0tLbnItUEo0VkZxZFE&usp=sharing
6. Untar it into /opt
cd /opt  
sudo tar -zxvf /home/ubuntu/sge6.tar.gz
cd
rm sge6.tar.gz
rm -rf StarCluster
7. Create and save the AMI that can now be used in a StarCluster configuration

Setup Numpy, Scipy, CUDA and other libraries

The next step is to install Numpy, Scipy, the CUDA compilers and tools, etc. It is recommended to have the Python virtualenv tooling to allow you to have different custom virtual Python environments for developing software. The following commands should get them installed.
sudo apt-get update
sudo apt-get -y dist-upgrade


sudo apt-get install -y gcc g++ gfortran build-essential git wget linux-image-generic libopenblas-dev python-dev python-pip python-nose python-numpy python-scipy


sudo apt-get install -y python-virtualenv
sudo wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.0-28_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1404_7.0-28_amd64.deb


sudo apt-get update
sudo apt-get install -y cuda


echo -e "\nexport PATH=/usr/local/cuda/bin:$PATH\n\nexport LD_LIBRARY_PATH=/usr/local/cuda/lib64" >> .bashrc


sudo shutdown -r now
Wait for the machine to reboot, log back in, and continue installation and setup as follows
cuda-install-samples-7.0.sh ~/
cd NVIDIA_CUDA-7.0_Samples/1_Utilities/deviceQuery
make
The following will make sure CUDA was installed correctly and verify that the GPU is accessible and ready for use.
./deviceQuery
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.0, CUDA Runtime Version = 7.0, NumDevs = 1, Device0 = GRID K520
Result = PASS
cd ~/
rm -rf cuda-repo-ubuntu1404_7.0-28_amd64.deb
rm -rf NVIDIA_CUDA-7.0_Samples/1_Utilities/deviceQuery
Create a virtual Python environment that has access to the globally installed system packages, and activate it by sourcing the activation script
virtualenv --system-site-packages theanoenv
source theanoenv/bin/activate
Install Theano within the virtual environment. I do it this way so that I can work on Theano itself to help fix bugs in the code; I can install it in an editable format and modify its code if needed. If you have no intention of modifying or updating Theano, you can install it outside the virtualenv.
As a rule, I tend to install all tools that are bleeding edge, and stuff they depend on inside the virtual python environment.
pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git
echo -e "\n[global]\nfloatX=float32\ndevice=gpu\n[mode]=FAST_RUN\n\n[nvcc]\nfastmath=True\n\n[cuda]\nroot=/usr/local/cuda" >> ~/.theanorc
Make sure the Theano installation works and can use the GPU. This will acquire the GPU and start running the Theano tests on it, which will take a while. You can interrupt it once you know the GPU is being used and at least some of the tests are running and passing.
python -c "import theano; theano.test()"
Next, pull Keras, the modular machine learning library that builds on Theano, so its sources are available to you, and pip install it as editable so that your changes to the Keras sources can be run, debugged and tested easily.
mkdir Workspace
cd Workspace
git clone https://github.com/fchollet/keras.git
pip install -e keras/
Next, set up passwordless SSH between nodes in the cluster. For this you need to copy over the key file (*.pem) that you downloaded from Amazon and that you use to SSH into your EC2 instance from your local machine. Copy this over to the instance you are working with, then ssh-add this *.pem key.
chmod 644 .ssh/authorized_keys
scp -i <your-public-encryption-key>.pem <your-public-encryption-key>.pem ubuntu@<public-ip-address-ec2-instance>:/home/ubuntu/
eval `ssh-agent`
ssh-add <your-public-encryption-key>.pem
Next, verify that you can do a passwordless SSH by trying an SSH into the same EC2 instance through its local IP address.
ssh ubuntu@<local-ip-address>
At this stage, you have everything installed and configured for working with the GPU using Theano and Keras. This software configuration can be used for masters and slaves in the cluster.
Create a slave AMI
But this would be a good point to go to the AWS menu and create an AMI, i.e. an image based on the software and configuration of your current instance. You can launch future EC2 instances with the AMI that you create here. I call this a slave AMI since I would like to add GUI desktop functionality to my master.
XFCE Desktop with X2GO for GUI access
I prefer to have the master machine running a GUI desktop, with xterms, to do my development on the master node: a setup that I can disconnect from and connect back to as needed, where my development environment is intact and I have some continuity of development.
For this I set up an XFCE desktop, known for its light footprint, and X2Go for remote GUI access, for its low bandwidth usage.
Add the X2Go Stable PPA
sudo add-apt-repository ppa:x2go/stable
sudo apt-get update
Install the XFCE packages and X2Go. Feel free to add other packages, but I purposely kept this selection small.
Installing "x2goserver-xsession" enables X2Go to launch any utilities specified under /etc/X11/Xsession.d/ , which is how a local X11 display or an XDMCP display would behave. This maximizes compatibility with applications.
sudo apt-get install xfce4 xfce4-goodies xfce4-artwork xubuntu-icon-theme firefox x2goserver x2goserver-xsession
Install X2Go Client and connect with it. In the X2Go Client "Session Preferences":
Specify "XFCE" as the "Session type."
If you have the SSH key in OpenSSH/PEM format, specify it in "Use RSA/DSA key for ssh connection".
If you have the ssh key in PuTTY .PPK format, convert it using PuTTYgen, and then specify it.
Or even better, just launch Pageant (part of the PuTTY suite), load the .PPK key in Pageant, then in X2Go Client select "Try auto login (ssh-agent or default ssh key)".
Disable Screen Saver to minimize CPU usage

Create EBS storage to be mounted on all cluster nodes

The next step is to create an EBS storage volume using a standard StarCluster-enabled image, so that it is created, formatted and made available, and to move the home directory, /home/ubuntu, onto this storage. This will get mounted as /home in clusters and hence will act as storage that persists between clusters being created and destroyed.
Create an EBS storage volume of the desired size. Use an image-id from the list shown by the "starcluster listpublic" command. Specify the region where you want the volume to be created, in my case us-west-2c, where GPU nodes are available and cheap. 100 is the size in gigabytes.
starcluster listpublic
starcluster createvolume --name=myhome --image-id=ami-04bedf34 100 us-west-2c
Note down the volume ID that is created.
You will need to temporarily configure the StarCluster config file to use the just-created EBS volume as shared storage at /myhome.
VOLUMES = myhome
..........................
[volume myhome]
VOLUME_ID = vol-595b0fbe
MOUNT_PATH = /myhome
This will mount the created EBS volume onto /myhome on the StarCluster master and slave nodes.
Now start a new cluster and log into the master node as user ubuntu
starcluster start mycluster
starcluster sshmaster mycluster -u ubuntu
Move the home directory to mounted storage
Make sure the .gnupg directory is owned by user ubuntu; if not, change its ownership as follows. Then tar the ubuntu home directory and save it into /myhome.
sudo chown -R $(whoami) ubuntu/.gnupg
sudo tar -czvf /myhome/ubuntu.owner.tar.gz --same-owner ubuntu
cd /myhome
sudo tar -zxvf ubuntu.owner.tar.gz
Now change the StarCluster configuration to mount the shared EBS volume on /home instead of /myhome, then terminate and restart the cluster. You should now have an ubuntu home directory in EBS storage that is not lost when you stop and restart your cluster.
Create a master AMI
But this would be a good point to go to the AWS menu and create the master AMI, with the GUI desktop functionality.

Updates

I plan on keeping this page updated as my setup evolves and I refine the environment.

Copyright (c) Sarvi Shanmugham

Tuesday, March 24, 2015

Machine Learning - Inventing the Singularity

Machine Learning

Inventing the Singularity

Recently, the dangers of Artificial Intelligence (AI) have been raised by more than a few eminent thinkers of our time. Elon Musk even went as far as calling it inventing the Devil. Too dramatic, you might say? I have also heard many bright minds in the middle of this Artificial Intelligence research and revolution, the people developing these machine learning technologies, defend the technology, dismiss the concerns being raised, and claim that Elon Musk and Bill Gates don't understand the state of AI and have no idea what they are talking about.

I stand clearly and convincingly on the side that Artificial intelligence has the "potential" to be Evil, to become the devil, the singularity that consumes everything else. More powerful than the invention of the atom bomb.  I might sound like an alarmist at this point. But my conviction comes from an understanding of the nature, direction and potential of the current class of Machine Learning technology. So, let me explain.

Let me be very clear from the very beginning, I think machine learning has a huge potential to do good, be a force for good, do great things.  And we should continue to make great strides in all these areas.

But, as they say, with great power comes great responsibility. It applies to machine learning. Its potential is limitless, and so are its dangers. And the dangers are even higher if we are completely oblivious to them. The following is some insight into what makes it powerful, what makes it risky and unpredictable, and how to recognize its risks and dangers.

I have had a background in artificial intelligence since my early days in engineering college, some 25 years ago. Though my day job is in all things networking (I work for Cisco, developing networking software that makes the internet do what it does), I have continued to pursue my passion for machine learning as a hobby throughout my adult life. So I have some insight into what I am saying below.

Very early incarnations of Artificial Intelligence were explicitly rule based, implemented in languages like PROLOG and LISP. You created rules of behavior to simulate intelligence. Then came the concept of learning these rules from raw data, hence the term Machine Learning. You would implement a generic learning algorithm that would feed on lots and lots of data and build rules from the data and then apply these rules on new and potentially unseen data. These were rigid systems that made black and white decisions.

But as we know, the real world is never truly black and white; it's a million shades of grey, not just 50. Sorry, couldn't resist that one.

Then came the statistical and probabilistic Machine Learning algorithms. People tried to develop mathematical models of the neurons in your brain and the network of neurons in your brain, also referred to as Neural Networks. These machines learned statistical and probabilistic rules and relationships from raw data and interactions with the world.

It is this new class of Learning Machines that I refer to, when I warn about the potential to be evil, the potential to be the singularity.

The current generation of these machines has become very proficient at things like recognizing speech, language, vision, etc. Self-driving cars that will be safer than humans are already here. It's not science fiction anymore.

Ok. Machine Learning is becoming smarter by the day. And in some specific areas like vision, even smarter than humans.

But it is still a huge stretch to go from there to Evil, the Devil, the Singularity. Or is it?

Let me explain a bit more.

My conviction comes from a few key traits of this new class of machine learning algorithms.

  1. They are probabilistic decision engines.
  2. They learn their probabilistic rules from looking at large quantities of data and interacting with the real world. 
  3. Their overall decision making and interactions with the world are made from the collective decision making of millions of simple probabilistic rule nodes called neurons.
  4. It is computationally infeasible to understand or test every potential input, interaction or decision that can be made by the collective decision making of these millions of individual probabilistic neurons.
  5. The recurrent versions of these machines are still in their infancy. These learn to make probabilistic decisions based not only on the current inputs, but also on their own past probabilistic decisions and internal state. This makes these machines far more complex and powerful, and makes it harder to understand what they have learnt.
  6. The larger these learning machines, the larger their capacity to learn. And the larger the sources of data, the more they can learn and the more complex it becomes to understand what they have learnt.

The importance of the last three points above should not be underestimated; they are the key to why I believe as I do.

Indulge me for a bit more. Let me elaborate on these points.

These new statistical and probabilistic machine learning algorithms don't learn explicit rules of interaction. They are NOT a collection of deterministic rules like "if this, do that". They are a huge collection of tiny probabilistic decision-making neuron models that learn to make probabilistic decisions. And the intelligence of the whole machine comes from the probabilistic learning and decision making of millions of these little nodes.

To put it in relatively simple terms, imagine a crowd of a billion "simple", not so smart people, each voting "probabilistically", based on what they directly see or hear from the real world or from their neighbors, to collectively make every decision they make together. Ring a bell? Can you really understand the decisions they will make? Can you really know how they will decide under every condition, input or interaction? Can you really say with any certainty that they will not be evil or make catastrophic decisions? Think of every war, genocide or man-made catastrophe in history before answering that question.

It is not possible to understand in real, concrete terms all the rules that have been learnt by this huge collection of tiny learning machines and predict all possible outcomes that might be generated by them. OK, it is possible, but only theoretically. The computational requirement to completely understand a very large machine learning model with billions of these neurons would be exponentially higher.

This is true even for the simple feed-forward learning models that are the predominant machine learning technology now. Once you step into the recurrent versions of these models, you get a dynamical system that is far more powerful and complex. They have historically been far more difficult to train, and it is harder to understand what they have learnt. It's an area of machine learning that is still in its infancy, but progress is being made every year, and they are getting more powerful with time.

Their power and complexity come from the fact that these probabilistic units or neurons not only make their decisions based on data inputs from the real world and/or the decisions of other neurons at the current time, but also take into consideration the probabilistic decisions and state of the network's neurons from earlier times. This means many things, good and bad depending on the scenario. Not only can they make good or bad decisions, they can continue to make these good or bad decisions long after the stimulus or input has ceased to exist. A bad decision doesn't just affect the current choice, but can continue to affect future decisions. They can continue to improve on a good decision or worsen a bad one, learn from good examples just as well as from bad ones, and make choices based on learnt probabilities of choices, learnt probabilities of right and wrong, learnt probabilities of good and bad.

A recurrent machine learning model is one step closer to self-conscious thought and decision making.

This might sound a lot like how we humans think. And you would be absolutely right. It is. That is exactly what current generation machine learning is trying to model. But the size and complexity of current generation neural networks is similar to that of the brain of a fly. It is growing and evolving every day, though. Now imagine something the size of a thousand human brains, that can learn from every piece of data available to it from all across the world. And what it learns is based on what it's taught, what data it sees and the conclusions it gleans from it. This is not a matter of if any more, but when.

It took 16,000 interconnected machines to recognize cats in a video, in an experiment by Google. A year later it took 16 machines with GPUs (graphics processing units), which are much faster at arithmetic. So computing is not the barrier. The algorithms keep improving every year; learning machines officially have super-vision, i.e. they are smarter than humans at least in some limited vision tasks, and will be driving cars soon. In the machine learning world, if computing and the algorithm are the rocket, the data they need to learn from is the rocket fuel, as one eminent machine learning scientist, Andrew Ng, put it at last week's GPU Technology Conference. And data is now available everywhere in spades, petabytes of it and growing exponentially.

In such a dynamic, probabilistic model, with its learning driven by huge amounts of data and its decisions governed by a potentially long history of inputs and previous decisions, its learning potential and consciousness are limited only by the amount of computing power and data we throw at it. For these systems, any choice or decision, even a murderous, destructive or even catastrophic one, does not have a zero probability; it just might have a very low probability.

And even that we can't be sure of, because we truly can't understand what rules it has learnt, what their probabilities are, or the exact sequence of events that could trigger such a decision. That is truly what makes it both powerful and dangerous. You can understand and control a nuclear reaction. But how do you control and risk-manage something that is just as powerful, but that you truly can't understand?

To summarize, these are the big points of concern to watch for:

  1. Recurrent learning machines are closer to inventing self-conscious intelligent thought than anything invented so far, and at this point we don't really even understand how close.
  2. Their level of intelligence is limited only by the amount of computing and data we throw at them. They certainly have the potential to surpass humans.
  3. It is theoretically not possible to understand all the rules they have learnt and how those rules translate to events in real life. Trying to understand the probability of every outcome for every possible sequence of events is exponentially more complex. This creates an inherent level of uncertainty, and hence risk, in the decisions that they make.
  4. The universal nature of this technology means that it can be applied to learn from a wide range of data and solve many problems. This means that it can eventually control very vital parts of society and make decisions that are life or death. A technology that can become an integral part of life.
There isn't another technology in history that really compares to AI in these respects; it is pretty unique.

And let me be clear, the potential for an AI singularity is just the final, big-ticket catastrophic risk. There are many other intermediate risks and uncertainties strewn all along that path like minefields, which need careful thought and consideration to navigate. They get riskier as the system gets more intelligent and intertwined with every aspect of our lives.

Are we there yet? Clearly and most definitely, not yet. None of the machine learning algorithms that exist today can do what I claim they eventually will. The technology is still in its infancy and it's not clear how close we are to machines of such intelligence. We could be decades away or just a simple algorithm tweak away. But the key attributes I list for these learning machines all exist and are currently evolving. The direction in which they are evolving is clearly headed there.

So when truly smart people like Elon Musk and Bill Gates sound a warning, everyone should listen. Not to shut down research in this area altogether, but to start thinking about its dangers, and the needed checks and balances.

Make sure we can recognize Pandora's Box from a mile away, let alone try to open it. But have no doubt in your mind that it truly has the potential to be Pandora's Box.