Learning machine learning – part 3

Image of the cover of the book Deep Learning with Python

Decided to take the plunge and purchase the Deep Learning with Python book and see what it has to offer. In prior posts (see Learning machine learning – part 1 & part 2) we were working with the cloud tutorials. This Part one is based on the book

It has a great introduction into deep learning which is a subset of machine learning. After what I know today, the Microsoft Azure session was more on traditional (statistical) machine learning and not deep learning.

 

Installing deep learning

In order to use the book, you need access to Keras, Python, Jupyter and a Keras backend (TensorFlow, Microsoft CNTK or Theanno).

I decided not to use any cloud solutions and rather install Python, Jupiter, TensorFlow and Keras on my MacBook. Although it probably would have been much easier (and more costly) to use any cloud solution.

I followed the directions on the installing TensorFlow website for the PIP install (you have to install a “virtual environment” and “PIP” first). The MacBook didn’t have a NVIDIA GPU so I needed to install the CPU version of TensorFlow.

But I had the hardest time running any of the book examples. Whenever I changed any command cell in a Jupyter notebook with Keras functionality in them (like adding a space to the end of an “import Keras” command line), it would throw a (module not found) error.

After days of web searching for what path is used for Jupyter notebook-iPython/Python imports (sys.path and PYTHONPATH) and where I should be importing Keras from (it’s not “~/ .keras”), I got nowhere closer to running anything.

I finally saw that I could directly install Keras (again, when I installed Tensorflow, it installed Keras as well) into my VENV. After I did that, everything worked. (I probably have one too many Keras environments, but who cares).

Finally getting the environment correct, I could now execute any command cells in a Jupyter notebook (with Keras functionality properly, well most of them anyways).

Jupyter notebooks for dummies…

It took me a while to figure out that the way you run a Jupyter notebook server is by issuing the command “jupyter notebook” (nowhere in the command’s help file, but can be found in Jupyter tutorials). That’s when I started to see the problems in the installation section above with my Keras installation.

Understanding Jupyter notebooks is non-trivial. Yes, I know it’s an interactive code and documentation environment. It’s sort of like BASIC on steroids with WORD functionality built in/escapeable into at any time.

First thing to understand is that when you open up a jupyter notebook, you haven’t executed anything yet. YES there are output lines in the notebook you just opened but NO, they aren’t from executing them under your client-server environment.

The output lines you see in the notebook is output from someone else’s execution run. So while they may look like they worked fine but they haven’t executed in your installation environment yet..

Also, when executing Jupyter notebook command cells, pay special attention to the In [?]: that’s shown to the right of every command cell.

When the ‘?’ is a number, like In:[12] that tells you what sequence (12th in the sequence) that (multi-line command cell) has been executed in and when the ‘?’ a “*”, like In[*], it says that the Jupyter notebook server is executing that command cell. 

Some command cells generate Out [?]: lines and others do not. So can’t use this to tell if something’s been executed or not. The only way to tell if some command cell has been executed is by seeing the In [n]: integer as n be incremented from the last command cell you executed. Of course you can execute command cells out of sequence if you wish.

Jupyter notebook coding/executing was weird as one who is more used to C, Java, and other coding languages and IDEs. A video tutorial on Jupiter notebooks would probably have helped here, but I couldn’t find one.

Running the examples

You can download all of the books current examples from the book’s website.

The book suggests you add model layers, subtract model layers and change the parameters of the number of nodes in a model as examples for you to try at home.

In general, doing so (once the environment was setup properly) seemed to work as desired. Adding layers didn’t seem to change the accuracy of the models, if anything it degraded it, and deleting layers didn’t help either. Ditto for adding or reducing node counts within a layer.

There’s a bunch of datasets that comes with Keras install used in the examples. Many examples have a first step where you modify this data so as to be more amenable to deep learning modeling.

For example, there’s a IMDB dataset that has film reviews. The film reviews are text files. But deep learning doesn’t work on text strings so you need to convert the text files into lists of integers. You do this by looking up each word in a word dictionary and substituting the index for each word in the review, generating an array (list) of integers.

This is all done through the NumPy package. It’s worth the time to become familiar with Python and probably NumPy. I took the verbal Python tutorial (but did nothing to learn NumPy).

Another example is a real estate prediction model that has 13 different parameters across 500 or so neighborhoods. The parameters are all different, some are distances, some %s, some pricing differences, etc. In order to perform deep learning on them, the example normalizes all of them, using distance from mean, in units of standard deviation.

There are other examples of data transformations as well. It seems that transforming your data into something amenable to deep learning is one part of the magic of deep learning.

Back to the book

Getting through chapter 3 of the book i- fairly straightforward when everything is set up properly. I found a iPad app (Juno) that could be used to connect to the Jupyter Server and it seemed to work once I found the proper command to use to start Jupyter (jupyter notebook –ip=”*”) and the proper Jupyter configuration parameters to use.

The examples are pretty self-documenting so you should be able to try out any of them on your own. The book adds great explanations on machine learning, deep learning and and the overall flow of how to approach a deep learning project.

Once you finish chapter 4 of the book you have all the tools one needs to tackle any deep learning project that you want to attack. You may need to read up on how to transform your data and you will probably be using one of the modeling techniques in one of the examples but it seems easy enough.

The rest of the book’s chapters (which I have yet to complete) deal with deep learning in practice and it’s in these chapters that you can learn some of the art of deep learning data science and model science..

~~~~

I ended up having fun with Jupyter notebooks, once I got them running with the iPad client in one hand and the book in the other. At the end of chapter 4, I startedto see some applications to my consulting business that might be interesting to model.

Using the Mac CPU was fast enough for the examples but I may have to tear down the crypto mine and use it as an AI server for my home network if I plan to tackle something with more data.

Wish me luck…

Learning machine learning – part 1

Saw an article this past week from AWS Re:Invent that they just released their Machine Learning curriculum and materials  free to the public. Google (Cloud Platform and elsewhere) TensorFlow,  (Facebook’s) PyTorch, and Microsoft Azure CNTK frameworks  education is also available and has been for awhile now.

My money is on PyTorch and Tensorflow as being the two frameworks most likely to succeed. However all the above use many open source facilities and there seems to be a lot of cross breeding across them. Both AWS ML solutions and Microsoft CNTK offer PyTorch and TensorFlow frameworks/APIs as one option among many others.  

AWS Machine Learning

I spent about an hour plus looking over the AWS SageMaker tutorial videos in the developer section of AWS machine learning curriculum. Signing up was fairly easy but I already had an AWS login. You also had to enroll/register for the course on your AWS login  but once that was through, you could access courses.

In the comments on the AWS blog post there were a number of entries indicating broken links and other problems but I didn’t have any issues. Then again, I didn’t start at the beginning, only looked at over one series of courses, and was using the websites one week after they were announced at Re:Invent.

Amazon SageMaker is an overarching framework that can be used to perform machine learning on AWS, all the way from gathering, analyzing and modifying the dataset(s), to training the model, to creating a inference engine available as an endpoint that can be used to perform the inferencing.

Amazon also has special purpose API based tools that allow customers to embed intelligence (inferencing) directly into their application, without needing to perform the ML training. These include:

  • Amazon Recognition which provides image (facial and other tagging) recognition services
  • Amazon Polly which provides text to speech services in multilple languages, and
  • Amazon Lex which provides speech recognition technology (used by Alexa) and together with Polly helps embed conversational interfaces into customer applications.

TensorFlow Machine Learning

In the past I looked over the TensorFlow tutorials and recently rechecked them out. I found them much easier to follow this time.

 

The Google IO 2018 video on TensorFlowGetting Started With TensorFlow High Level APIs, takes you through a brief introduction to the Colab(oratory),  a GCP solution that uses TensorFlow and how to use Tensorflow Keras, tf.data and TensorFlow Eager Execution to create machine learning models and perform machine learning.

 Keras on TensorFlow seems to be the easiest approach to  use machine learning technologies. The video spends most of the time discussing a Colab Keras code element,  ~9 lines, that loads a image classification dataset, defines a 1 level (one standard layer and one output layer), trains it, validates it and uses it to perform  inferencing.

The video also touches a bit on tf.data and TensorFlow Eager Execution but the main portion discusses the 9 line TensorFlow Keras machine learning example.

Both Colab and AWS Sagemaker use and discuss Jupyter Notebooks. These appear to be an open source approach to documenting and creating a workflow and executing Python code automatically.

GCP Colab is essentially a GCP-Google Drive based Jupyter notebook execution engine. With Colab you create a Jupyter notebook on google drive and interactively execute it under Colab. You can download your Juyiter notebook files and essentially execute them anywhere else that supports TensorFlow (that supports TensorFlow v1.7 or above, with Keras API support).

In the video, the Google IO   instructors (Josh Gordon and Lawrence Moroney) walk you through building a model to recognize handwritten digits and outputs a classification (0..9) of what the handwritten digit represents.

It uses a standard labeled handwriting to digits labeled data set, called the MNIST database of handwritten digits that’s already been broken up into a training set and a validation set. Josh calls this the “Hello World” of machine learning.

The instructor in the video walks you through the (Jupyter Notebook – Eager Execution-Keras) code that inputs the data set (line 2), builds a 1 level (really two layer, one neural net layer and one output layer) neural network model (lines 3-6), trains the model (line 7), tests/validates the model (line 8) and then uses it to perform an inference (line 9).

Josh spends a little time discussing neural networks and model optimizations and some of the other parameters used in the code above. He has a few visualizations of what this all means but for the most part, the code uses a simple way to build a neural net model and some standard optimization techniques for the network.

He then goes on to discuss tf.data which is an API that can be used to create machine learning datasets and provide this data to the neural net for training or inferencing.  Apparently tf.data has a number of nifty features that allow you to take raw data and transform it into something that can be used to feed neural nets. For example, separating the data into batches, shuffling (randomizing) the batches of data, pre-fetching it so as to not starve the GPU matrix multipliers, etc.

Then it goes into how machine learning is different than regular coding. And show how TensorFlow Eager Execution is really just like Python execution. They go through another example (larger) of machine learning, this one distinguishes between cats and dogs. While they use an open source Python IDE ,  PyCharm, to test and walk through their TF Eager Execution code, setting breakpoints and examining data along the way.

At the end of the video they show a link to a Google crash course on TensorFlow machine learning and they refer to a book Deep Learning with Python by Francois Chollet. They also mention a browser version of TensorFlow which uses Java Script and  your browser to develop, train and perform inferences using TensorFlow Keras machine learning.

~~~~

Never got around to Microsoft’s Azure training other than previewing some websites but plan to look over that soon.

I would have to say that the Google IO session on using TensorFlow high level APIs was a lot more enjoyable (~40 minutes) than the AWS multiple tutorial videos (>>40 minutes) that I watched to learn about SageMaker.

Not a fair comparison as one was a Google IO intro session on TensorFlow high level APIs and the other was a series of actual training videos on Amazon SageMaker and the AWS services you can use to take advantage of it.

But the GCP session left me thinking I can handle learning more and using machine learning (via TensorFlow, Keras, Eager Execution, & tf.data) to actually do something while the SageMaker sessions left me thinking, how much AWS facilities and AWS infrastructure services,  I would need to understand and use to ever get to actually developing a machine learning model.

I suppose one was more of an (AWS SageMaker) infrastructure tutorial  and the other was more of an intro into machine learning using TensorFlow wherever you wanted to execute it.

I think I’m almost ready to start creating and feeding a TensorFlow model with my handwriting and seeing if it can properly interpret it into searchable text. If it can do that, I would be a happy camper

Comments…

Photo credits: 

Screenshos from AWS Sagemaker series of tutorial video 1, 2, 3, 4 & 5, you may need a signin to view them

Screenshots from the Getting Started with TensorFlow High Level APIs YouTube video