
Flask with Embedded Machine Learning I : Serializing with pickle and DB setup

Python-Flask.png











Introduction

Web applications that use machine learning models include spam detection in submission forms, shopping portals, search engines, recommendation systems for media, and so on.

Throughout this series of articles, we'll see how to embed a machine learning model into a web application that not only classifies input but also learns from data in real time.

Through this series, we can extend our knowledge of machine learning theory. We are going to see how to serialize a model after training and how to load it for later use. We also create a SQLite database and a Flask web application that makes our movie classifier available to the user.







Serializing fitted scikit-learn estimators

Because training a machine learning model can be computationally expensive, we don't want to retrain the model from the training data every time we need it.

Fortunately, Python's pickle module allows us to serialize and deserialize objects, so we can save our classifier in its current state and reload it whenever we want to classify new samples, without learning the model from the training data again.

In the following code, we create a movieclassifier directory where we will later store the files and data for our web application. Within this movieclassifier directory, we create a pkl_objects subdirectory to save the serialized Python objects to our local drive.

pickle-dump.png

Using pickle's dump() function, we then serialized the trained logistic regression model as well as the stop word set from the NLTK library, so that we don't have to install the NLTK vocabulary on our server.

Here is the signature of dump():

pickle.dump(obj, file, protocol=None, *, fix_imports=True)

dump() writes a pickled representation of obj to the open file object file.

In the code, we passed "wb" as the mode argument of open(): "w" opens the file for writing and "b" selects binary mode, which pickle requires.
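The serialization step described above could look roughly like the following sketch. The `stop` set and the `clf` dictionary are placeholders standing in for the NLTK stop word set and the trained logistic regression model (assumptions made so that the snippet runs on its own), the file names stopwords.pkl and classifier.pkl are illustrative, and a temporary directory takes the place of the project root.

```python
import os
import pickle
import tempfile

# Placeholders (assumptions): in the tutorial these would be the NLTK
# stop word set and the trained scikit-learn classifier.
stop = {"a", "an", "and", "the"}
clf = {"note": "stand-in for the trained classifier"}

# A temporary directory stands in for the project root.
root = tempfile.mkdtemp()
dest = os.path.join(root, "movieclassifier", "pkl_objects")
os.makedirs(dest, exist_ok=True)

# "wb" = write + binary; the with-block closes each file after dumping.
for name, obj in (("stopwords.pkl", stop), ("classifier.pkl", clf)):
    with open(os.path.join(dest, name), "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)

saved_files = sorted(os.listdir(dest))
print(saved_files)
```

Opening the files in a with-block is a design choice of this sketch: it guarantees that each file is flushed and closed even if dump() raises an exception.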





movieclassifier/vectorizer.py

Since HashingVectorizer does not need to be fitted, we don't need to pickle it.
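To see why no fitting is required, here is a deliberately simplified, pure-Python illustration of the hashing trick. It is not scikit-learn's implementation (which uses a different hash function and returns a sparse matrix); the function name `hashed_counts` and the tiny `n_features` value are assumptions chosen for readability. The point is that the column index is computed from the token itself, so there is no learned vocabulary to store.

```python
import hashlib
import re

def hashed_counts(text, n_features=16):
    """Map a document to a fixed-length count vector via the hashing trick."""
    vec = [0] * n_features
    for token in re.findall(r"\w+", text.lower()):
        # A stable hash of the token picks the column; no vocabulary is stored.
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % n_features] += 1
    return vec

v1 = hashed_counts("I love this movie")
v2 = hashed_counts("I love this movie")
print(v1 == v2, sum(v1))
```

Because the mapping depends only on the hash function and on `n_features`, a freshly constructed vectorizer in a new session produces the same vectors as the one used during training, as long as it is created with the same parameters.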

Instead, we can create a new Python script file, from which we can import the vectorizer into our current Python session. Here is the code (movieclassifier/vectorizer.py):

vectorizer-py.png

Now that we have pickled the Python objects, let's deserialize them and test whether they really work.





Deserializing

Now we want to check that we can deserialize the objects without any errors.

Let's restart the Jupyter Notebook kernel (or open a terminal), navigate to the movieclassifier directory, start a new session, and execute the following code to verify that we can import the vectorizer and unpickle the classifier:

ImportVectorizerUnpickleClassfier.png
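The unpickling half of the round trip can be sketched as follows. To keep the snippet self-contained, it first writes a placeholder stop word set to a temporary file and then reads it back; the helper name `load_pickle` is hypothetical, and in the tutorial the files would be the ones stored under pkl_objects.

```python
import os
import pickle
import tempfile

def load_pickle(path):
    """Read one pickled object from path ("rb" = read + binary)."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Write a placeholder stop word set first so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "stopwords.pkl")
original = {"a", "an", "and", "the"}
with open(path, "wb") as f:
    pickle.dump(original, f)

restored = load_pickle(path)
print(restored == original)
```

Note that unpickling can execute arbitrary code, so pickle.load() should only be used on files from a trusted source.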

Once we have successfully loaded the vectorizer and unpickled the classifier, we can use these objects to pre-process document samples and make predictions about their sentiment.


Since our classifier returns the class labels as integers, we defined a simple Python dictionary to map those integers to their sentiment:

SentimentLabelDict.png

We then used the HashingVectorizer to transform the simple example document into a word vector $X$:

XVector.png

Finally, we used the predict method of the logistic regression classifier to predict the class label as well as the predict_proba() method to return the corresponding probability of our prediction.

PredictionCode.png

Note that the predict_proba() method returns an array with a probability value for each unique class label.

Since the class label with the largest probability is the one returned by predict(), we used np.max() to obtain the probability of the predicted class.
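The relationship between the two methods can be illustrated without a trained model. In the sketch below, the `proba` values are made up purely for illustration (they are not results from the tutorial's classifier), the label mapping 0 = negative, 1 = positive is an assumption, and plain Python lists are used in place of NumPy arrays so that the snippet has no dependencies.

```python
# Hypothetical predict_proba() output for one document:
# one row per sample, one column per class label (0 and 1).
proba = [[0.18, 0.82]]

# Assumed mapping from integer class labels to sentiment names.
label = {0: "negative", 1: "positive"}

row = proba[0]
predicted = row.index(max(row))   # class with the largest probability
confidence = max(row)             # what np.max(proba) would return here

print("Prediction: %s\nProbability: %.2f%%" % (label[predicted], confidence * 100))
```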


Here are additional results after replacing the example with more negative comments as input to the classifier:

SentimentNegatives.png

Favorable review:

SentimentFavorableReview.png



SQLite setup

Now we want to set up a SQLite database to collect feedback from the app's users, which we can use to update our classification model.

SQLite is a self-contained SQL database engine, and it is ideal for simple web applications.

The Python standard library already includes an API, sqlite3, which allows us to work with SQLite databases.

With the following code, we create a connection (conn) to a SQLite database file by calling sqlite3's connect() function, which creates the new database file reviews.sqlite in the movieclassifier directory if it does not already exist:

SQLiteSetup.png

Next, we created a cursor via the cursor() method, which allows us to iterate over the database records.

By calling execute(), we created a new database table, review_db. We used this to store and access database entries. We also created three columns in this database table: review, sentiment, and date.

We're using these to store example movie reviews and respective class labels (sentiments).

Using the DATETIME('now') function, we also added a date and time stamp to each entry.

In addition to the timestamps, we used question marks (?) as placeholders to pass the movie review texts and the corresponding class labels (1 and 0) as positional arguments to the execute() method, as members of a tuple.

Finally, we called commit() to save the changes we made to the database and closed the connection via the close() method.
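The steps above can be sketched as follows. The column types (TEXT, INTEGER, TEXT) are an assumption, since the original listing is only available as an image, and a temporary directory stands in for the movieclassifier directory; the two example reviews are the ones that appear in the shell output later in this article.

```python
import os
import sqlite3
import tempfile

# A temporary directory stands in for the movieclassifier directory.
db_path = os.path.join(tempfile.mkdtemp(), "reviews.sqlite")

conn = sqlite3.connect(db_path)   # creates the file if it does not exist
c = conn.cursor()
# Column types are an assumption of this sketch.
c.execute("CREATE TABLE review_db (review TEXT, sentiment INTEGER, date TEXT)")

insert_sql = ("INSERT INTO review_db (review, sentiment, date) "
              "VALUES (?, ?, DATETIME('now'))")
# Each ? is filled, in order, from the tuple passed as the second argument.
c.execute(insert_sql, ("I love this movie", 1))
c.execute(insert_sql, ("I disliked this movie", 0))

conn.commit()   # persist the inserts
conn.close()
```

Passing values through ? placeholders rather than formatting them into the SQL string lets sqlite3 handle quoting, which matters once the review text comes from users of the web application.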


To check whether the entries have been stored in the database table correctly, we reopen the connection to the database and use the SELECT command to fetch all rows in the table that were committed between the beginning of November 1, 2016 and today:

Reopen-SQL-To-Check-Stored-Review.png
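A date-range query of this kind might look like the sketch below. It uses an in-memory database as a stand-in for reviews.sqlite so that it runs on its own, and the row labelled "older review" is a hypothetical entry added only to show that rows outside the window are filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # in-memory stand-in for reviews.sqlite
c = conn.cursor()
c.execute("CREATE TABLE review_db (review TEXT, sentiment INTEGER, date TEXT)")
# Hypothetical row dated before the query window.
c.execute("INSERT INTO review_db VALUES (?, ?, ?)",
          ("older review", 0, "2016-10-15 12:00:00"))
c.execute("INSERT INTO review_db VALUES (?, ?, DATETIME('now'))",
          ("I love this movie", 1))
conn.commit()

# Dates are stored as 'YYYY-MM-DD HH:MM:SS' text, so BETWEEN compares
# them correctly as strings.
c.execute("SELECT * FROM review_db "
          "WHERE date BETWEEN '2016-11-01 00:00:00' AND DATETIME('now')")
results = c.fetchall()
conn.close()

print(results)
```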



SQLite Manager on Firefox

We may want to use the free Firefox browser plugin SQLite Manager (https://addons.mozilla.org/en-US/firefox/addon/sqlite-manager/):

ScreenshotSQLiteManager.png

After restarting the browser, select "SQLite Manager" from the "Tools" menu.

ConnectToDB.png
Open-movieclassifier.png
Open-reviews.sqlite.png
review_db_sentiment.png



SQLite shell

We can also access the database from the sqlite3 command-line shell.

To open our reviews.sqlite:

$ sqlite3 reviews.sqlite

Then, we'll get a sqlite prompt. To list tables, we can type .tables:

sqlite> .tables

To see what's in the table:

sqlite> select * from review_db;
I love this movie|1|2016-11-03 15:55:12
I disliked this movie|0|2016-11-03 15:55:12
The movie is the best ever, and the most intriguing film.
|1|2016-11-04 16:07:21
The movie is the best ever, and the most intriguing film.|1|2016-11-04 16:37:22
This film served as great entertainment with its colorful cast and numerous plot twists.

The gorgeous action scenes and impressive dialogue really held the audience's attention and kept them on the edge of their seats.|1|2016-11-05 05:43:22
This movie is the best ever.|1|2016-11-05 06:13:37
Overall, this is a great movie with a mix of a ton of laughs and a love story all rolled into one. 
If you're looking for a pick-me-up or to laugh hysterically, this is undoubtedly the movie for you. |1|2016-11-05 19:36:00
sqlite> 

To list databases:

sqlite> .databases
seq  name             file                                                      
---  ---------------  ----------------------------------------------------------
0    main             /home/sfvue/MySites/ahaman/reviews.sqlite 

We can exit from sqlite with Ctrl + D or with the .quit command.





GitHub Jupyter Notebook source

The Jupyter notebook is available on GitHub:
FlaskAppWithEmbeddedMachineLearningSentimentAnalysis.ipynb





GitHub source

Source is available from ahaman-Flask-with-Machine-Learning-Sentiment-Analysis





Refs

Python Machine Learning, Sebastian Raschka





Next

Flask with Embedded Machine Learning II : Basic Flask App









Ph.D. / Golden Gate Ave, San Francisco / Seoul National Univ / Carnegie Mellon / UC Berkeley / DevOps / Deep Learning / Visualization


- K Hong























Copyright © 2024, bogotobogo