Docker - ELK 7.6: Elasticsearch

Note

Elastic Stack docker/kubernetes series:

  • Docker - ELK 7.6 : Elasticsearch
  • Docker - ELK 7.6 : Filebeat
  • Docker - ELK 7.6 : Logstash (All in One)
  • Docker - ELK 7.6 : Kibana
  • Docker - ELK 7.6 : Kibana II
  • Docker - ELK 7.6 : Elastic Stack with Docker Compose
  • Docker - Deploy Elastic Cloud on Kubernetes (ECK) via Elasticsearch operator on minikube
  • Docker - Deploy Elastic Stack via Helm on minikube




  • Install Elasticsearch 7.6 with Docker

    Elasticsearch is available as Docker images. The images use centos:7 as the base image.

    A list of all published Docker images and tags is available at www.docker.elastic.co. The source files are in https://github.com/elastic/elasticsearch/blob/7.6/distribution/docker.



    Pulling the image

    Issue a docker pull command against the Elastic Docker registry:

    $ docker pull docker.elastic.co/elasticsearch/elasticsearch:7.6.2
    


    Starting a single node cluster

    To start a single-node Elasticsearch cluster for development or testing, we set discovery.type to single-node. The node then elects itself master and will not join a cluster with any other node.

    $ docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.6.2
    

    By default, Elasticsearch will use port 9200 for requests and port 9300 for communication between nodes within the cluster.


    To see if it works, simply issue the following:

    $ curl -XGET 'localhost:9200'
    {
      "name" : "caa1097bc4af",
      "cluster_name" : "docker-cluster",
      "cluster_uuid" : "WcBnCZzNS_WR2_0J5H1cdg",
      "version" : {
        "number" : "7.6.2",
        "build_flavor" : "default",
        "build_type" : "docker",
        "build_hash" : "aa751e09be0a5072e8570670309b1f12348f023b",
        "build_date" : "2020-02-29T00:15:25.529771Z",
        "build_snapshot" : false,
        "lucene_version" : "8.4.0",
        "minimum_wire_compatibility_version" : "6.8.0",
        "minimum_index_compatibility_version" : "6.0.0-beta1"
      },
      "tagline" : "You Know, for Search"
    }
    

    To check the cluster health, we will be using the cat API:

    $ curl 'localhost:9200/_cat/health?v'
    epoch      timestamp cluster        status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
    1585433442 22:10:42  docker-cluster green           1         1      0   0    0    0        0             0                  -                100.0%
    

    We can also get a list of nodes in our cluster as follows:

    $ curl 'localhost:9200/_cat/nodes?v'
    ip         heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
    172.17.0.2           13          96   1    0.01    0.01     0.00 dilm      *      caa1097bc4af
    

    Here, we can see our one node named "caa1097bc4af", which is the single node that is currently in our cluster.
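    With the v parameter, cat output is a plain-text table with a header row. As a minimal sketch, the table can be turned into one dictionary per row in Python. The helper name parse_cat_table is our own and not part of any Elasticsearch client; the sample text is the health output shown above.

```python
def parse_cat_table(text):
    """Split verbose (?v) _cat output into one dict per row.

    Assumes no cell contains whitespace, which holds for the
    health output used here but not for every _cat endpoint.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


sample = """\
epoch      timestamp cluster        status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1585433442 22:10:42  docker-cluster green           1         1      0   0    0    0        0             0                  -                100.0%
"""

rows = parse_cat_table(sample)
print(rows[0]["status"], rows[0]["node.total"])
```

    All values come back as strings, so numeric columns need converting before they are compared.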



    Indices, Types, and Documents

    Adding data to Elasticsearch is called indexing, because the data we feed in is stored in Apache Lucene indexes, from which it is later retrieved.

    The most familiar way to picture the layout is by analogy with a relational database. We can (very roughly) think of an index as a database:

    1. MySQL => Databases => Tables => Columns/Rows
    2. Elasticsearch => Indices => Types => Documents with Properties

    RDBMS (MySQL)    Elasticsearch
    -------------    --------------------------------
    Databases        Indices
    Tables           Types
    Rows             Documents
    Columns          Fields (Properties of Documents)
    Schema           Mapping

    Note that mapping types are deprecated in Elasticsearch 7.x: an index holds a single type, and the typeless _doc endpoint used in the examples below takes its place in URLs.

    In Elasticsearch, the term document has a specific meaning. It refers to the top-level, or root object that is serialized into JSON and stored in Elasticsearch under a unique ID.

    A document consists not only of its data but also of metadata (information about the document). The three required metadata elements are as follows:

    1. _index: where the document lives
    2. _type: the class of object that the document represents
    3. _id: the unique identifier for the document

    So, the URL of a document has the following components:

    1. Index
      An index is the equivalent of a database in a relational database. The index is the top-most level and can be found at
      http://localhost:9200/<index>
      
    2. Types
      Types are objects that are contained within indexes. They are like tables. Being a child of the index, they can be found at
      http://localhost:9200/<index>/<type>
      
    3. ID
      To index our first JSON object, we make a PUT request to the REST API at a URL made up of the index name, type name and ID:
      http://localhost:9200/<index>/<type>/[<id>]
      

    Index and type are required while the id part is optional. We can use either the POST or the PUT method to add data. If we don't specify an id, Elasticsearch will generate one for us, and in that case we must use POST instead of PUT.
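    The URL layout and the POST/PUT rule can be sketched as a small helper. This is only an illustration: document_request and BASE_URL are names we made up, and the type defaults to _doc, the endpoint used by the 7.x examples later on.

```python
from urllib.parse import quote

BASE_URL = "http://localhost:9200"  # assumption: the single-node container started earlier


def document_request(index, doc_id=None, doc_type="_doc"):
    """Return the (HTTP method, URL) pair for indexing one document."""
    url = f"{BASE_URL}/{quote(index, safe='')}/{doc_type}"
    if doc_id is None:
        # No ID given: POST and let Elasticsearch generate one.
        return "POST", url
    # Explicit ID: PUT creates or replaces the document stored under it.
    return "PUT", f"{url}/{quote(str(doc_id), safe='')}"


print(document_request("twitter"))
print(document_request("twitter", 1))
```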



    Indexing

    Let's create an index, "twitter":

    $ curl -X PUT "localhost:9200/twitter?pretty"
    {
      "acknowledged" : true,
      "shards_acknowledged" : true,
      "index" : "twitter"
    }    
    

    Check if the index has been created:

    $ curl "localhost:9200/twitter?pretty"
    {
      "twitter" : {
        "aliases" : { },
        "mappings" : { },
        "settings" : {
          "index" : {
            "creation_date" : "1585580705116",
            "number_of_shards" : "1",
            "number_of_replicas" : "1",
            "uuid" : "wYmyP-t6QFq5eHHGpT_bzg",
            "version" : {
              "created" : "7060199"
            },
            "provided_name" : "twitter"
          }
        }
      }
    }
    

    When creating an index, the request body can optionally specify the following:

    1. Index aliases
    2. Mappings for fields in the index
    3. Settings for the index
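
    As a rough sketch, a request body that carries all three sections could be assembled like this in Python. The alias name "tweets" and the single mapped field are hypothetical values chosen for illustration. The request object is only built here; nothing is sent.

```python
import json
import urllib.request

# Hypothetical values for illustration: the alias name and the
# mapped field are not prescribed by the tutorial.
body = {
    "aliases": {"tweets": {}},
    "mappings": {"properties": {"email": {"type": "keyword"}}},
    "settings": {"index": {"number_of_shards": 1, "number_of_replicas": 0}},
}

request = urllib.request.Request(
    "http://localhost:9200/twitter",
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PUT",
)

# Nothing is sent until the request is opened, for example with
# urllib.request.urlopen(request) against a running node.
print(request.get_method(), request.full_url)
print(json.dumps(body, indent=2))
```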


    To delete the index:

    $ curl -X DELETE "localhost:9200/twitter?pretty"
    {
      "acknowledged" : true
    }
    

    Let's make sure the index has been deleted:

    $ curl "localhost:9200/twitter?pretty"
    {
      "error" : {
        "root_cause" : [
          {
            "type" : "index_not_found_exception",
            "reason" : "no such index [twitter]",
            "resource.type" : "index_or_alias",
            "resource.id" : "twitter",
            "index_uuid" : "_na_",
            "index" : "twitter"
          }
        ],
        "type" : "index_not_found_exception",
        "reason" : "no such index [twitter]",
        "resource.type" : "index_or_alias",
        "resource.id" : "twitter",
        "index_uuid" : "_na_",
        "index" : "twitter"
      },
      "status" : 404
    }
    

    As expected, we get an error because we don't have the "twitter" index any more.



    Indexing with setting

    Now we want more control over the index than the defaults give us, so we will create it again. Because this is a test setup on a local machine, a minimal index with just one shard and no replicas is enough:

    $ curl -X PUT "localhost:9200/twitter?pretty" -H 'Content-Type: application/json' -d'
    	{
    	    "settings" : {
    	        "index" : {
    	            "number_of_shards" : 1, 
    	            "number_of_replicas" : 0
    	        }
    	    }
    	}
    '
    {
      "acknowledged" : true,
      "shards_acknowledged" : true,
      "index" : "twitter"
    }
    

    Check what we've done:

    $ curl "localhost:9200/twitter?pretty"
    {
      "twitter" : {
        "aliases" : { },
        "mappings" : { },
        "settings" : {
          "index" : {
            "creation_date" : "1585585646565",
            "number_of_shards" : "1",
            "number_of_replicas" : "0",
            "uuid" : "zPuLa5kMTGiq2A16FD4zMg",
            "version" : {
              "created" : "7060199"
            },
            "provided_name" : "twitter"
          }
        }
      }
    }
    

    Internally, Elasticsearch uses Apache Lucene, a powerful search engine library. Each shard is a Lucene index: a self-contained unit that cannot be split in place and only grows as documents are added.

    Shards are used to distribute data over multiple nodes, which is why one shard is enough on a single-node system. One thing to note is that the number of primary shards is fixed when the index is created; the number of replicas can be changed later.

    A replica is a copy of a shard. The shard being copied is called the primary shard, and it can have 0 or more replicas. When we insert data into Elasticsearch, it is stored in the primary shard first, and then in the replicas.
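
    The arithmetic behind shards and replicas is simple enough to sketch. The two helpers below are our own simplification (they ignore allocation rules such as disk watermarks). They show why an index created with the default of one replica stays yellow on a single node: the replica has no second node to live on.

```python
def total_shard_copies(primaries, replicas):
    """Each primary shard plus its replicas: primaries * (1 + replicas)."""
    return primaries * (1 + replicas)


def can_be_green(replicas, data_nodes):
    """A replica is never placed on the node holding its primary, so
    every copy of a shard needs a data node of its own."""
    return data_nodes >= 1 + replicas


# Default settings (1 primary, 1 replica) on a single node.
print(total_shard_copies(1, 1), can_be_green(1, 1))
# The minimal index created above (1 primary, 0 replicas).
print(total_shard_copies(1, 0), can_be_green(0, 1))
```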




    Indexing with Mapping

    Mapping is the process of defining how a document, and the fields it contains, are stored and indexed.

    $ curl -X PUT "localhost:9200/twitter/_mapping?pretty" -H 'Content-Type: application/json' -d'
    {
    	"properties": {
    	   "age":    { "type": "integer" },  
    	   "email":  { "type": "keyword"  }, 
    	   "name":   { "type": "text"  }     
    	}
    }
    '
    {
      "acknowledged" : true
    }
    

    We can see three field types here: an integer field, a keyword field (indexed as a single exact value), and a text field (analyzed for full-text search).

    Creating a mapping explicitly is optional: we can load data into Elasticsearch without one, and Elasticsearch will guess the field types for us (dynamic mapping).
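
    To get a feel for what this guessing (dynamic mapping) does, here is a deliberately simplified Python imitation. It is not Elasticsearch code, and it leaves out date detection and the keyword sub-field that real dynamic mapping adds to strings. Note how the guesses differ from the explicit mapping above: age would become long rather than integer, and email would become text rather than keyword.

```python
def guess_field_type(value):
    """Very rough stand-in for dynamic mapping of a single JSON value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, str):
        # Real dynamic mapping also tries date detection and adds a
        # keyword sub-field; both are left out of this sketch.
        return "text"
    raise TypeError(f"unsupported value: {value!r}")


doc = {"age": 25, "email": "JohnDoe@gmail.com", "name": "John_Doe"}
guessed = {field: guess_field_type(value) for field, value in doc.items()}
print(guessed)
```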

    To view the mapping of the "twitter" index:

    $ curl -X GET "localhost:9200/twitter/_mapping?pretty"
    {
      "twitter" : {
        "mappings" : {
          "properties" : {
            "age" : {
              "type" : "integer"
            },
            "email" : {
              "type" : "keyword"
            },
            "name" : {
              "type" : "text"
            }
          }
        }
      }
    }
    

    To view the mapping of a specific field:

    $ curl -X GET "localhost:9200/twitter/_mapping/field/email?pretty"
    {
      "twitter" : {
        "mappings" : {
          "email" : {
            "full_name" : "email",
            "mapping" : {
              "email" : {
                "type" : "keyword"
              }
            }
          }
        }
      }
    }
    


    Loading Data

    Now that we have our index and a mapping, we may want to load some data into Elasticsearch.

    There are several ways of loading data (such as via Kibana, Beats/Logstash, or a client library), but here we'll use the Index API to insert a document into an index:

    $ curl -X POST "localhost:9200/twitter/_doc?pretty" -H 'Content-Type: application/json' -d'
    	{
    	      "age":    "25",  
    	      "email":  "JohnDoe@gmail.com", 
    	      "name":  "John_Doe"   
    	}
    '
    {
      "_index" : "twitter",
      "_type" : "_doc",
      "_id" : "DFWFLHEBKxEZJmQe-laZ",
      "_version" : 1,
      "result" : "created",
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "failed" : 0
      },
      "_seq_no" : 0,
      "_primary_term" : 1
    }    
    

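    As we can see, a document id (_id) was generated automatically, though we could have chosen our own _id. Note that age was sent as the string "25": Elasticsearch coerces it to an integer for indexing but keeps _source exactly as submitted.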



    Querying Data

    We can query using the _search endpoint of our twitter index:

    $ curl -X GET "localhost:9200/twitter/_search?q=name:John_Doe&pretty"
    {
      "took" : 9,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 1.3940738,
        "hits" : [
          {
            "_index" : "twitter",
            "_type" : "_doc",
            "_id" : "DFWFLHEBKxEZJmQe-laZ",
            "_score" : 1.3940738,
            "_source" : {
              "age" : "25",
              "email" : "JohnDoe@gmail.com",
              "name" : "John_Doe"
            }
          }
        ]
      }
    }
    

    Another way of searching is by performing a request body search:

    $ curl -X GET "localhost:9200/twitter/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "query" : {
        "term" : {"email" : "JohnDoe@gmail.com" }
      }
    }
    '
    {
      "took" : 5,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 0.35667494,
        "hits" : [
          {
            "_index" : "twitter",
            "_type" : "_doc",
            "_id" : "DFWFLHEBKxEZJmQe-laZ",
            "_score" : 0.35667494,
            "_source" : {
              "age" : "25",
              "email" : "JohnDoe@gmail.com",
              "name" : "John_Doe"
            }
          }
        ]
      }
    }    
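
    The two search styles differ only in where the query travels: in the URL or in the JSON body. A minimal sketch of building both in Python follows. The helper names and BASE_URL are our own assumptions, and nothing is sent.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://localhost:9200"  # assumption: local single-node setup


def uri_search_url(index, query):
    """URL for a query-string (URI) search; urlencode escapes the ':'."""
    return f"{BASE_URL}/{index}/_search?{urlencode({'q': query})}"


def term_search_body(field, value):
    """Request body for an exact-match term query on one field."""
    return json.dumps({"query": {"term": {field: value}}})


print(uri_search_url("twitter", "name:John_Doe"))
print(term_search_body("email", "JohnDoe@gmail.com"))
```

    A term query looks for the exact value, which suits the keyword-mapped email field. For an analyzed text field such as name, a match query is usually the better fit.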
    


    Elasticsearch Query Samples from elastic.co

    Here, we'll play with queries.

    Because we may want to use the Kibana console, let's install it using docker-compose. Just clone Einsteinish-ELK-Stack-with-docker-compose.

    Before we start the stack, let's modify the X-Pack setup in "elasticsearch/config/elasticsearch.yml" to set "xpack.security.enabled: false". Otherwise, requests sent without credentials may fail with the following error:

     {"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}},"status":401}   
    

    Now, let's run the ELK stack with docker-compose:

    $ docker-compose up -d
    Creating network "einsteinish-elk-stack-with-docker-compose_elk" with driver "bridge"
    Creating einsteinish-elk-stack-with-docker-compose_elasticsearch_1 ... done
    Creating einsteinish-elk-stack-with-docker-compose_kibana_1        ... done
    Creating einsteinish-elk-stack-with-docker-compose_logstash_1      ... done
    
    $ docker-compose ps
                              Name                                         Command               State                                        Ports                                      
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    einsteinish-elk-stack-with-docker-compose_elasticsearch_1   /usr/local/bin/docker-entr ...   Up      0.0.0.0:9200->9200/tcp, 0.0.0.0:9300->9300/tcp                                  
    einsteinish-elk-stack-with-docker-compose_kibana_1          /usr/local/bin/dumb-init - ...   Up      0.0.0.0:5601->5601/tcp                                                          
    einsteinish-elk-stack-with-docker-compose_logstash_1        /usr/local/bin/docker-entr ...   Up      0.0.0.0:5000->5000/tcp, 0.0.0.0:5000->5000/udp, 5044/tcp, 0.0.0.0:9600->9600/tcp
    $ 
    

    We'll start with the cat APIs, which are intended only for human consumption, from the Kibana console or the command line.

    To list all available commands:

    $ curl -X GET "localhost:9200/_cat"
    =^.^=
    /_cat/allocation
    /_cat/shards
    /_cat/shards/{index}
    /_cat/master
    /_cat/nodes
    /_cat/tasks
    /_cat/indices
    /_cat/indices/{index}
    /_cat/segments
    /_cat/segments/{index}
    /_cat/count
    /_cat/count/{index}
    /_cat/recovery
    /_cat/recovery/{index}
    /_cat/health
    /_cat/pending_tasks
    /_cat/aliases
    /_cat/aliases/{alias}
    /_cat/thread_pool
    /_cat/thread_pool/{thread_pools}
    /_cat/plugins
    /_cat/fielddata
    /_cat/fielddata/{fields}
    /_cat/nodeattrs
    /_cat/repositories
    /_cat/snapshots/{repository}
    /_cat/templates
    

    Each of the _cat commands accepts a query string parameter v to turn on verbose output. The following example was run in the Kibana console:

    [Image: Kibana_DevTools_Console.png]

    Without v, the output has no header row:

    $ curl -X GET "localhost:9200/_cat/nodes"
    192.168.96.2 60 92 5 0.35 0.32 0.41 dilm * 8b73d9076e68
    

    The h query string parameter restricts the output to the listed columns:

    $ curl -X GET "localhost:9200/_cat/nodes?h=ip,port,heapPercent,name&pretty"
    192.168.96.2 9300 67 8b73d9076e68
    

    We can also request multiple columns using simple wildcards like /_cat/thread_pool?h=ip,queue* to get all headers (or aliases) starting with queue.

    $ curl -X GET "localhost:9200/_cat/thread_pool?h=ip,queue*"
    192.168.96.2 0   16
    192.168.96.2 0  100
    ...
    192.168.96.2 0   -1
    192.168.96.2 0    4
    192.168.96.2 0   -1
    192.168.96.2 0 1000
    192.168.96.2 0  200
    

    Suppose we want to find the largest index in our cluster (by storage used by all its shards, not by number of documents). The /_cat/indices API is ideal. We only need to add three things to the API request:

    1. The bytes query string parameter with a value of b to get byte-level resolution.
    2. The s (sort) parameter with a value of store.size:desc to sort the output by shard storage in descending order.
    3. The v (verbose) parameter to include column headings in the response.

    $ curl -X GET "localhost:9200/_cat/indices?bytes=b&s=store.size:desc&v"
    health status index                             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    green  open   .monitoring-es-7-2020.04.06       XQSHRxs7RsOZ3ZeLsa0Y_Q   1   0      77410        55160   34506536       34506536
    green  open   .monitoring-es-7-2020.04.10       Gjs4h4dqTIWKC3owN0cjqQ   1   0       9435         1443   10628315       10628315
    green  open   .monitoring-logstash-7-2020.04.06 6itFo78lShiWIb1e5i6GKg   1   0      43969            0    3007501        3007501
    green  open   .monitoring-kibana-7-2020.04.06   LWbl13UVQLq_cWwkCccatA   1   0       5522            0    1220685        1220685
    green  open   .monitoring-logstash-7-2020.04.10 myGRrPNMRlebzYfOdBbHUQ   1   0       2577            0     542115         542115
    green  open   .monitoring-kibana-7-2020.04.10   knd52K_vSTellI_qhGPYpA   1   0        544            0     269953         269953
    green  open   .security-7                       e21FT4JoQ2WML_oax2GFYA   1   0         36            0      99098          99098
    green  open   .kibana_1                         2yJ-CzinQ-Czv0H3Rg4mQg   1   0         10            1      39590          39590
    yellow open   logstash-2020.04.06-000001        YShJ9NKUQO-4TuwhS0MlXA   1   1        100            0      36727          36727
    green  open   ilm-history-1-000001              rVfV3nLQSXOM7c6yN68dbg   1   0         18            0      32919          32919
    green  open   .kibana_task_manager_1            y29CTX98TEuZt3pb6lnXhA   1   0          2            0       6823           6823
    green  open   .apm-agent-configuration          zC5fg2AhSVK0TvV3WUcv_Q   1   0          0            0        283            283
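
    The query string parameters seen so far (h, s, bytes, v) combine freely. As a small sketch, a helper can assemble such URLs; the helper name cat_url is hypothetical and the host is assumed to be the local node.

```python
from urllib.parse import urlencode


def cat_url(endpoint, verbose=False, **params):
    """Build a _cat URL; keyword arguments become query parameters."""
    query = urlencode(params, safe=":*,")  # keep ':' ',' '*' readable
    if verbose:
        query = f"{query}&v" if query else "v"
    url = f"http://localhost:9200/_cat/{endpoint}"
    return f"{url}?{query}" if query else url


print(cat_url("indices", verbose=True, bytes="b", s="store.size:desc"))
print(cat_url("thread_pool", h="ip,queue*"))
```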
    

    The following two requests return the same response in JSON format:

    $ curl 'localhost:9200/_cat/indices?format=json&pretty'
    [
      {
        "health" : "green",
        "status" : "open",
        "index" : ".security-7",
        "uuid" : "e21FT4JoQ2WML_oax2GFYA",
        "pri" : "1",
        "rep" : "0",
        "docs.count" : "36",
        "docs.deleted" : "0",
        "store.size" : "96.7kb",
        "pri.store.size" : "96.7kb"
      },
      ...
      
    $ curl 'localhost:9200/_cat/indices?pretty' -H "Accept: application/json"
    [
      {
        "health" : "green",
        "status" : "open",
        "index" : ".security-7",
        "uuid" : "e21FT4JoQ2WML_oax2GFYA",
        "pri" : "1",
        "rep" : "0",
        "docs.count" : "36",
        "docs.deleted" : "0",
        "store.size" : "96.7kb",
        "pri.store.size" : "96.7kb"
      },
      ...
    

    The s query string parameter sorts the table by the columns given as its value. Columns are specified either by name or by alias, and are provided as a comma-separated string. By default, sorting is ascending. Appending :desc to a column inverts the ordering for that column. :asc is also accepted but behaves the same as the default sort order.

    For example, with a sort string s=column1,column2:desc,column3, the table will be sorted in ascending order by column1, in descending order by column2, and in ascending order by column3.
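
    To make the ordering rule concrete, here is a small Python imitation of it. It mimics only the semantics of the sort string, not how Elasticsearch implements it, and the three rows are made-up sample data.

```python
def cat_sort(rows, sort_spec):
    """Sort dict rows the way an s=col1,col2:desc string describes.

    Values are compared as plain Python values; this mimics the
    ordering rule only, not Elasticsearch's own implementation.
    """
    keys = []
    for part in sort_spec.split(","):
        column, _, direction = part.partition(":")
        keys.append((column, direction == "desc"))
    # Python's sort is stable, so apply the least significant key first.
    for column, descending in reversed(keys):
        rows = sorted(rows, key=lambda row: row[column], reverse=descending)
    return rows


# Made-up rows for illustration.
rows = [
    {"index": "bank", "pri": 1, "docs": 1000},
    {"index": "customer", "pri": 1, "docs": 1},
    {"index": "logs", "pri": 2, "docs": 100},
]
for row in cat_sort(rows, "pri,docs:desc"):
    print(row["index"])
```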


    Let's put JSON documents into an Elasticsearch index.

    We can do this directly with a simple PUT request that specifies the index we want to add the document to, a unique document ID, and one or more "field": "value" pairs in the request body:

    $ curl -X PUT "localhost:9200/customer/_doc/1?pretty" -H 'Content-Type: application/json' -d'
    {
      "name": "John Doe"
    }
    '
    {
      "_index" : "customer",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "result" : "created",
      "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
      },
      "_seq_no" : 0,
      "_primary_term" : 1
    }
    

    This request automatically creates the customer index if it doesn’t already exist, adds a new document that has an ID of 1, and stores and indexes the name field.


    The new document is available immediately from any node in the cluster. We can retrieve it with a GET request that specifies its document ID:

    $ curl -X GET "localhost:9200/customer/_doc/1?pretty"
    {
      "_index" : "customer",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "_seq_no" : 0,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "John Doe"
      }
    }
    

    If we have a lot of documents to index, we can submit them in batches with the bulk API (https://www.elastic.co/guide/en/elasticsearch/reference/7.6/docs-bulk.html).

    Let's download the accounts.json sample data set, a randomly generated data set that represents user accounts with the following information:

    {
        "account_number": 0,
        "balance": 16623,
        "firstname": "Bradshaw",
        "lastname": "Mckenzie",
        "age": 29,
        "gender": "F",
        "address": "244 Columbus Place",
        "employer": "Euron",
        "email": "bradshawmckenzie@euron.com",
        "city": "Hobucken",
        "state": "CO"
    }
    

    We're going to index the account data into the bank index with the following _bulk request:

    $ curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_bulk?pretty&refresh" --data-binary "@accounts.json"  
    {
      "took" : 711,
      "errors" : false,
      "items" : [
        {
          "index" : {
            "_index" : "bank",
            "_type" : "_doc",
            "_id" : "1",
            "_version" : 1,
            "result" : "created",
            "forced_refresh" : true,
            "_shards" : {
              "total" : 2,
              "successful" : 1,
              "failed" : 0
            },
            "_seq_no" : 0,
            "_primary_term" : 1,
            "status" : 201
          }
        },
    ...
        {
          "index" : {
            "_index" : "bank",
            "_type" : "_doc",
            "_id" : "995",
            "_version" : 1,
            "result" : "created",
            "forced_refresh" : true,
            "_shards" : {
              "total" : 2,
              "successful" : 1,
              "failed" : 0
            },
            "_seq_no" : 999,
            "_primary_term" : 1,
            "status" : 201
          }
        }
      ]
    }
    

    The --data-binary option posts the data exactly as specified, with no extra processing. In contrast, --data (or -d) strips carriage returns and newlines when it reads the data from a file, which would break the newline-delimited format the bulk API requires. Both options send a POST request and default to the content type application/x-www-form-urlencoded, the same one a browser uses when a user submits an HTML form, which is why we set the Content-Type header explicitly.
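
    The newline-delimited layout is easy to see when the body is generated by hand. The sketch below builds a bulk body in Python: each document contributes an action line and a source line, and the body ends with a newline. The helper name bulk_body is our own, the use of account_number as the document ID is an assumption made for this example, and the second record is invented.

```python
import json


def bulk_body(docs, id_field="account_number"):
    """Serialize documents as newline-delimited JSON for _bulk.

    Each document becomes two lines: an action line and the source.
    The body must end with a newline.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_id": str(doc[id_field])}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


docs = [
    {"account_number": 0, "balance": 16623, "firstname": "Bradshaw"},
    {"account_number": 1, "balance": 100, "firstname": "Example"},  # made-up record
]
body = bulk_body(docs)
print(body, end="")
```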


    We can check if the 1,000 documents were indexed successfully:

    $ curl -X GET "localhost:9200/_cat/indices?v&s=index&pretty"
    health status index                             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    green  open   .apm-agent-configuration          zC5fg2AhSVK0TvV3WUcv_Q   1   0          0            0       283b           283b
    green  open   .kibana_1                         2yJ-CzinQ-Czv0H3Rg4mQg   1   0         10            1     38.6kb         38.6kb
    green  open   .kibana_task_manager_1            y29CTX98TEuZt3pb6lnXhA   1   0          2            0       32kb           32kb
    green  open   .monitoring-es-7-2020.04.06       XQSHRxs7RsOZ3ZeLsa0Y_Q   1   0      77410        55160     32.9mb         32.9mb
    green  open   .monitoring-es-7-2020.04.10       Gjs4h4dqTIWKC3owN0cjqQ   1   0      30096            0     10.7mb         10.7mb
    green  open   .monitoring-es-7-2020.04.11       FMSpb4JKScGYhu8nEzXt1A   1   0         95           18    695.5kb        695.5kb
    green  open   .monitoring-kibana-7-2020.04.06   LWbl13UVQLq_cWwkCccatA   1   0       5522            0      1.1mb          1.1mb
    green  open   .monitoring-kibana-7-2020.04.10   knd52K_vSTellI_qhGPYpA   1   0       1752            0    534.6kb        534.6kb
    green  open   .monitoring-kibana-7-2020.04.11   GxM1BDKvRkGTv0gWyt8U_A   1   0          3            0     42.9kb         42.9kb
    green  open   .monitoring-logstash-7-2020.04.06 6itFo78lShiWIb1e5i6GKg   1   0      43969            0      2.8mb          2.8mb
    green  open   .monitoring-logstash-7-2020.04.10 myGRrPNMRlebzYfOdBbHUQ   1   0       8617            0      1.1mb          1.1mb
    green  open   .monitoring-logstash-7-2020.04.11 TAjxbas0Rd-5mb2JOmvr0A   1   0         15            0     95.5kb         95.5kb
    green  open   .security-7                       e21FT4JoQ2WML_oax2GFYA   1   0         36            0     96.7kb         96.7kb
    yellow open   bank                              bDhhObs0SMiHpPJti21rZA   1   1       1000            0    414.1kb        414.1kb
    yellow open   customer                          Q68qN_NBSOqz3dnWG6P0yQ   1   1          1            0      3.4kb          3.4kb
    green  open   ilm-history-1-000001              rVfV3nLQSXOM7c6yN68dbg   1   0         18            0     32.1kb         32.1kb
    yellow open   logstash-2020.04.06-000001        YShJ9NKUQO-4TuwhS0MlXA   1   1        100            0     35.8kb         35.8kb    
    

    To see only the bank index:

    $ curl -X GET "localhost:9200/_cat/indices/bank?v&pretty"
    health status index uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    yellow open   bank  bDhhObs0SMiHpPJti21rZA   1   1       1000            0    414.1kb        414.1kb    
    


    Elasticsearch Search Samples from elastic.co

    Now that we have ingested some data into an Elasticsearch index, we can search it by sending requests to the _search endpoint. To access the full suite of search capabilities, we use the Elasticsearch Query DSL to specify the search criteria in the request body. We specify the name of the index we want to search in the request URI.

    The following request, for example, retrieves all documents in the bank index sorted by account number:

    $ curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "query": { "match_all": {} },
      "sort": [
        { "account_number": "asc" }
      ]
    }
    ' 
    {
      "took" : 3,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1000,
          "relation" : "eq"
        },
        "max_score" : null,
        "hits" : [
          {
            "_index" : "bank",
            "_type" : "_doc",
            "_id" : "0",
            "_score" : null,
            "_source" : {
              "account_number" : 0,
              "balance" : 16623,
              "firstname" : "Bradshaw",
              "lastname" : "Mckenzie",
              "age" : 29,
              "gender" : "F",
              "address" : "244 Columbus Place",
              "employer" : "Euron",
              "email" : "bradshawmckenzie@euron.com",
              "city" : "Hobucken",
              "state" : "CO"
            },
            "sort" : [
              0
            ]
          },
    ...
          {
            "_index" : "bank",
            "_type" : "_doc",
            "_id" : "9",
            "_score" : null,
            "_source" : {
              "account_number" : 9,
              "balance" : 24776,
              "firstname" : "Opal",
              "lastname" : "Meadows",
              "age" : 39,
              "gender" : "M",
              "address" : "963 Neptune Avenue",
              "employer" : "Cedward",
              "email" : "opalmeadows@cedward.com",
              "city" : "Olney",
              "state" : "OH"
            },
            "sort" : [
              9
            ]
          }
        ]
      }
    }
    

    As we can see from the output above, by default, the hits section of the response includes the first 10 documents that match the search criteria.


    The response also provides the following information about the search request:

    1. took - how long it took Elasticsearch to run the query, in milliseconds
    2. timed_out - whether or not the search request timed out
    3. _shards - how many shards were searched and a breakdown of how many shards succeeded, failed, or were skipped
    4. hits.total.value - how many matching documents were found
    5. hits.max_score - the score of the most relevant document found
    6. hits.sort - the document’s sort position (when not sorting by relevance score)
    7. hits._score - the document’s relevance score (not applicable when using match_all)
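
    Once the response is parsed as JSON, these fields are ordinary dictionary lookups. The following sketch pulls them out in Python; summarize is a hypothetical helper, and the response is a trimmed copy of the one shown earlier, with only two hits kept.

```python
def summarize(response):
    """Pull the fields described above out of a parsed _search response."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "failed_shards": response["_shards"]["failed"],
        "total": hits["total"]["value"],
        "ids": [hit["_id"] for hit in hits["hits"]],
    }


# Trimmed copy of the response shown earlier (only two hits kept).
response = {
    "took": 3,
    "timed_out": False,
    "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 1000, "relation": "eq"},
        "max_score": None,
        "hits": [
            {"_index": "bank", "_id": "0", "_score": None, "sort": [0]},
            {"_index": "bank", "_id": "9", "_score": None, "sort": [9]},
        ],
    },
}
summary = summarize(response)
print(summary)
```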

    To page through the search hits, specify the from and size parameters in our request. For example, the following request gets hits 10 through 12:

    $ curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "query": { "match_all": {} },
      "sort": [
        { "account_number": "asc" }
      ],
      "from": 10,
      "size": 3
    }
    '
    {
      "took" : 15,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1000,
          "relation" : "eq"
        },
        "max_score" : null,
        "hits" : [
          {
            "_index" : "bank",
            "_type" : "_doc",
            "_id" : "10",
            "_score" : null,
            "_source" : {
              "account_number" : 10,
              "balance" : 46170,
              "firstname" : "Dominique",
              "lastname" : "Park",
              "age" : 37,
              "gender" : "F",
              "address" : "100 Gatling Place",
              "employer" : "Conjurica",
              "email" : "dominiquepark@conjurica.com",
              "city" : "Omar",
              "state" : "NJ"
            },
            "sort" : [
              10
            ]
          },
          {
            "_index" : "bank",
            "_type" : "_doc",
            "_id" : "11",
            "_score" : null,
            "_source" : {
              "account_number" : 11,
              "balance" : 20203,
              "firstname" : "Jenkins",
              "lastname" : "Haney",
              "age" : 20,
              "gender" : "M",
              "address" : "740 Ferry Place",
              "employer" : "Qimonk",
              "email" : "jenkinshaney@qimonk.com",
              "city" : "Steinhatchee",
              "state" : "GA"
            },
            "sort" : [
              11
            ]
          },
          {
            "_index" : "bank",
            "_type" : "_doc",
            "_id" : "12",
            "_score" : null,
            "_source" : {
              "account_number" : 12,
              "balance" : 37055,
              "firstname" : "Stafford",
              "lastname" : "Brock",
              "age" : 20,
              "gender" : "F",
              "address" : "296 Wythe Avenue",
              "employer" : "Uncorp",
              "email" : "staffordbrock@uncorp.com",
              "city" : "Bend",
              "state" : "AL"
            },
            "sort" : [
              12
            ]
          }
        ]
      }
    }    
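
    Note that from is a zero-based offset, which is why from=10 with size=3 returns hits 10 through 12. When paging with a fixed page size, the offset for a 1-based page number works out to (page - 1) * size. The following is a minimal sketch of that arithmetic. The names page_offset and page_body are hypothetical helpers introduced here for illustration (they are not part of Elasticsearch or curl), and the curl call is left commented out because it assumes the local node and the bank index used in this tutorial.

```shell
#!/bin/sh
# Hypothetical helper: offset of the first hit on a 1-based page,
# i.e. the value to send as "from".
# $1 = page number (1-based), $2 = page size
page_offset() {
  echo $(( ($1 - 1) * $2 ))
}

# Build a match_all search body for one page, sorted by account_number
# as in the request above.
page_body() {
  printf '{"query":{"match_all":{}},"sort":[{"account_number":"asc"}],"from":%d,"size":%d}\n' \
    "$(page_offset "$1" "$2")" "$2"
}

# Second page of ten hits: the body carries "from":10,"size":10
page_body 2 10

# To send it to the local node (assumes the bank index is loaded):
# curl -s -X GET "localhost:9200/bank/_search?pretty" \
#   -H 'Content-Type: application/json' -d "$(page_body 2 10)"
```

    Keeping the offset calculation in one place avoids off-by-one mistakes when switching between page numbers and raw offsets.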
    

    Now we can start to construct queries that are a bit more interesting than match_all.

    To search for specific terms within a field, we can use a match query. For example, the following request searches the address field to find customers whose addresses contain mill or lane:

    $ curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "query": { "match": { "address": "mill lane" } }
    }
    '
    {
      "took" : 18,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 19,
          "relation" : "eq"
        },
        "max_score" : 9.507477,
        "hits" : [
          {
            "_index" : "bank",
            "_type" : "_doc",
            "_id" : "136",
            "_score" : 9.507477,
            "_source" : {
              "account_number" : 136,
              "balance" : 45801,
              "firstname" : "Winnie",
              "lastname" : "Holland",
              "age" : 38,
              "gender" : "M",
              "address" : "198 Mill Lane",
              "employer" : "Neteria",
              "email" : "winnieholland@neteria.com",
              "city" : "Urie",
              "state" : "IL"
            }
          },  
          {
            "_index" : "bank",
            "_type" : "_doc",
            "_id" : "970",
            "_score" : 5.4032025,
            "_source" : {
              "account_number" : 970,
              "balance" : 19648,
              "firstname" : "Forbes",
              "lastname" : "Wallace",
              "age" : 28,
              "gender" : "M",
              "address" : "990 Mill Road",
              "employer" : "Pheast",
              "email" : "forbeswallace@pheast.com",
              "city" : "Lopezo",
              "state" : "AK"
            }
          },
          ...
    

    To perform a phrase search rather than matching individual terms, we use match_phrase instead of match. For example, the following request only matches addresses that contain the phrase mill lane:

    $ curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "query": { "match_phrase": { "address": "mill lane" } }
    }
    ' 
    {
      "took" : 45,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 9.507477,
        "hits" : [
          {
            "_index" : "bank",
            "_type" : "_doc",
            "_id" : "136",
            "_score" : 9.507477,
            "_source" : {
              "account_number" : 136,
              "balance" : 45801,
              "firstname" : "Winnie",
              "lastname" : "Holland",
              "age" : 38,
              "gender" : "M",
              "address" : "198 Mill Lane",
              "employer" : "Neteria",
              "email" : "winnieholland@neteria.com",
              "city" : "Urie",
              "state" : "IL"
            }
          }
        ]
      }
    }
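
    The difference between the two queries can be mimicked locally without a cluster. The sketch below is a deliberate simplification: it only lowercases the text and splits on spaces, whereas the real analyzer does more, and it ignores relevance scoring entirely. The function names matches_any_term and matches_phrase are hypothetical and exist only in this example.

```shell
#!/bin/sh
# Simplified stand-in for the analyzer: lowercase and pad with spaces
# so that whole words can be found with a substring test.
normalize() {
  printf ' %s ' "$1" | tr '[:upper:]' '[:lower:]'
}

# match-like: succeeds if ANY query term appears as a whole word.
# $1 = field text, $2 = query string
matches_any_term() {
  text=$(normalize "$1")
  for term in $(printf '%s' "$2" | tr '[:upper:]' '[:lower:]'); do
    case "$text" in
      *" $term "*) return 0 ;;
    esac
  done
  return 1
}

# match_phrase-like: succeeds only if the terms appear consecutively, in order.
matches_phrase() {
  text=$(normalize "$1")
  phrase=$(normalize "$2")
  case "$text" in
    *"$phrase"*) return 0 ;;
  esac
  return 1
}

# Addresses taken from the responses shown in this section.
for address in "198 Mill Lane" "990 Mill Road" "467 Hutchinson Court"; do
  if matches_any_term "$address" "mill lane"; then
    echo "match:        $address"
  fi
  if matches_phrase "$address" "mill lane"; then
    echo "match_phrase: $address"
  fi
done
```

    Of the three addresses, 198 Mill Lane satisfies both checks, 990 Mill Road satisfies only the term check, and 467 Hutchinson Court satisfies neither, which mirrors why the match query returned 19 hits and match_phrase only one.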
    

    To construct more complex queries, we can use a bool query to combine multiple query criteria. We can designate criteria as required (must match), desirable (should match), or undesirable (must not match).

    For example, the following request searches the bank index for accounts that belong to customers who are 33 years old, but excludes anyone who lives in Idaho (ID):

    $ curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "age": "33" } }
          ],
          "must_not": [
            { "match": { "state": "ID" } }
          ]
        }
      }
    }
    '
    {
      "took" : 6,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 50,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "bank",
            "_type" : "_doc",
            "_id" : "18",
            "_score" : 1.0,
            "_source" : {
              "account_number" : 18,
              "balance" : 4180,
              "firstname" : "Dale",
              "lastname" : "Adams",
              "age" : 33,
              "gender" : "M",
              "address" : "467 Hutchinson Court",
              "employer" : "Boink",
              "email" : "daleadams@boink.com",
              "city" : "Orick",
              "state" : "MD"
            }
          },
          ...
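
    The include/exclude logic of must and must_not can be sketched locally with awk. This is only an illustration under simplifying assumptions: it treats both clauses as exact comparisons and does no scoring. The rows are copied from accounts that appear in the responses in this section, and bool_filter is a hypothetical helper name.

```shell
#!/bin/sh
# Sample rows from the responses shown earlier: account_number age state
accounts='10 37 NJ
11 20 GA
12 20 AL
18 33 MD
49 23 RI
136 38 IL
970 28 AK'

# bool_filter AGE STATE
#   must:     age equals AGE
#   must_not: state equals STATE
# Prints the account numbers that pass both conditions.
bool_filter() {
  printf '%s\n' "$accounts" |
    awk -v age="$1" -v state="$2" '$2 == age && $3 != state { print $1 }'
}

# Same shape as the request above: age 33, excluding Idaho.
bool_filter 33 ID

# A variant that exercises the exclusion: age 20, excluding Georgia.
bool_filter 20 GA
```

    In the sample, accounts 11 and 12 are both 20 years old, but account 11 is in GA, so the second call keeps only account 12.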
    

    Each must, should, and must_not element in a Boolean query is referred to as a query clause. How well a document meets the criteria in each must or should clause contributes to the document’s relevance score. The higher the score, the better the document matches our search criteria. By default, Elasticsearch returns documents ranked by these relevance scores.

    The criteria in a must_not clause are treated as a filter. They affect whether or not the document is included in the results, but do not contribute to how documents are scored. We can also explicitly specify arbitrary filters to include or exclude documents based on structured data.

    For example, the following request uses a range filter to limit the results to accounts with a balance between $20,000 and $30,000 (inclusive).

    $ curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "query": {
        "bool": {
          "must": { "match_all": {} },
          "filter": {
            "range": {
              "balance": {
                "gte": 20000,
                "lte": 30000
              }
            }
          }
        }
      }
    }
    '
    {
      "took" : 3,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 217,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "bank",
            "_type" : "_doc",
            "_id" : "49",
            "_score" : 1.0,
            "_source" : {
              "account_number" : 49,
              "balance" : 29104,
              "firstname" : "Fulton",
              "lastname" : "Holt",
              "age" : 23,
              "gender" : "F",
              "address" : "451 Humboldt Street",
              "employer" : "Anocha",
              "email" : "fultonholt@anocha.com",
              "city" : "Sunriver",
              "state" : "RI"
            }
          },
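
    Because gte and lte are both inclusive, a balance of exactly 20000 or 30000 would be kept. A minimal local sketch of the same bounds check follows. It uses balances copied from accounts shown in this section, and balance_between is a hypothetical helper name rather than anything provided by Elasticsearch.

```shell
#!/bin/sh
# Sample rows from the responses shown earlier: account_number balance
accounts='1 39225
10 46170
11 20203
12 37055
18 4180
49 29104
136 45801
970 19648'

# balance_between LOW HIGH
# Prints account numbers whose balance satisfies LOW <= balance <= HIGH,
# mirroring "gte": LOW and "lte": HIGH.
balance_between() {
  printf '%s\n' "$accounts" |
    awk -v lo="$1" -v hi="$2" '$2 >= lo && $2 <= hi { print $1 }'
}

balance_between 20000 30000
```

    For these sample rows the call prints accounts 11 and 49. Account 970, with a balance of 19648, falls just below the lower bound.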
    


    Elasticsearch Analyze Samples from elastic.co

    Elasticsearch aggregations enable us to get meta-information about our search results and answer questions like, "How many account holders are in Texas?" or "What’s the average balance of accounts in Tennessee?" We can search documents, filter hits, and use aggregations to analyze the results all in one request.

    For example, the following request uses a terms aggregation to group all of the accounts in the bank index by state, and returns the ten states with the most accounts in descending order:

    $ curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "size": 0,
      "aggs": {
        "group_by_state": {
          "terms": {
            "field": "state.keyword"
          }
        }
      }
    }
    '
    {
      "took" : 11,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1000,
          "relation" : "eq"
        },
        "max_score" : null,
        "hits" : [ ]
      },
      "aggregations" : {
        "group_by_state" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 743,
          "buckets" : [
            {
              "key" : "TX",
              "doc_count" : 30
            },
            {
              "key" : "MD",
              "doc_count" : 28
            },
            {
              "key" : "ID",
              "doc_count" : 27
            },
            {
              "key" : "AL",
              "doc_count" : 25
            },
            {
              "key" : "ME",
              "doc_count" : 25
            },
            {
              "key" : "TN",
              "doc_count" : 25
            },
            {
              "key" : "WY",
              "doc_count" : 25
            },
            {
              "key" : "DC",
              "doc_count" : 24
            },
            {
              "key" : "MA",
              "doc_count" : 24
            },
            {
              "key" : "ND",
              "doc_count" : 24
            }
          ]
        }
      }
    }    
    

    The buckets in the response are the values of the state field. The doc_count shows the number of accounts in each state. For example, we can see that there are 27 accounts in ID (Idaho). Because the request sets size to 0, the response contains only the aggregation results and omits the account details (the hits), which would otherwise look like this:

        "hits" : [
          {
            "_index" : "bank",
            "_type" : "_doc",
            "_id" : "1",
            "_score" : 1.0,
            "_source" : {
              "account_number" : 1,
              "balance" : 39225,
              "firstname" : "Amber",
              "lastname" : "Duke",
              "age" : 32,
              "gender" : "M",
              "address" : "880 Holmes Lane",
              "employer" : "Pyrami",
              "email" : "amberduke@pyrami.com",
              "city" : "Brogan",
              "state" : "IL"
            }
          },
          ...
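
    Conceptually, the terms aggregation above is a group-by-and-count, sorted by count in descending order and cut off after a fixed number of buckets. The sketch below reproduces that idea with standard Unix tools on the states of the handful of accounts shown in this section. It is an analogy, not how Elasticsearch computes buckets internally, and top_states is a hypothetical helper name.

```shell
#!/bin/sh
# State of each sample account shown in the earlier responses.
states='IL
NJ
GA
AL
MD
RI
IL
AK'

# top_states N
# Prints "state count" for the N most frequent states, highest count first.
# Ties are broken alphabetically by state code.
top_states() {
  printf '%s\n' "$states" | sort | uniq -c |
    awk '{ print $2, $1 }' |
    LC_ALL=C sort -k2,2nr -k1,1 |
    head -n "$1"
}

top_states 3
```

    With this sample, IL comes first with a count of 2, followed by AK and AL with 1 each.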
    

    We can combine aggregations to build more complex summaries of our data. For example, the following request nests an avg aggregation within the previous group_by_state aggregation to calculate the average account balance for each state.

    $ curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "size": 0,
      "aggs": {
        "group_by_state": {
          "terms": {
            "field": "state.keyword"
          },
          "aggs": {
            "average_balance": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
    '
    {
      "took" : 38,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1000,
          "relation" : "eq"
        },
        "max_score" : null,
        "hits" : [ ]
      },
      "aggregations" : {
        "group_by_state" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 743,
          "buckets" : [
            {
              "key" : "TX",
              "doc_count" : 30,
              "average_balance" : {
                "value" : 26073.3
              }
            },
            {
              "key" : "MD",
              "doc_count" : 28,
              "average_balance" : {
                "value" : 26161.535714285714
              }
            },
            {
              "key" : "ID",
              "doc_count" : 27,
              "average_balance" : {
                "value" : 24368.777777777777
              }
            },
            {
              "key" : "AL",
              "doc_count" : 25,
              "average_balance" : {
                "value" : 25739.56
              }
            },
            {
              "key" : "ME",
              "doc_count" : 25,
              "average_balance" : {
                "value" : 21663.0
              }
            },
            {
              "key" : "TN",
              "doc_count" : 25,
              "average_balance" : {
                "value" : 28365.4
              }
            },
            {
              "key" : "WY",
              "doc_count" : 25,
              "average_balance" : {
                "value" : 21731.52
              }
            },
            {
              "key" : "DC",
              "doc_count" : 24,
              "average_balance" : {
                "value" : 23180.583333333332
              }
            },
            {
              "key" : "MA",
              "doc_count" : 24,
              "average_balance" : {
                "value" : 29600.333333333332
              }
            },
            {
              "key" : "ND",
              "doc_count" : 24,
              "average_balance" : {
                "value" : 26577.333333333332
              }
            }
          ]
        }
      }
    }    
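
    A nested avg aggregation computes one average per bucket of the enclosing terms aggregation. The same per-group computation can be sketched with awk on the sample accounts shown in this section. This is an illustrative analogy only, and avg_by_state is a hypothetical helper name.

```shell
#!/bin/sh
# Sample rows from the responses shown earlier: state balance
accounts='IL 39225
NJ 46170
GA 20203
AL 37055
MD 4180
RI 29104
IL 45801
AK 19648'

# avg_by_state
# Prints "state doc_count average_balance" for every state in the sample,
# sorted by state code.
avg_by_state() {
  printf '%s\n' "$accounts" |
    awk '{ sum[$1] += $2; n[$1]++ }
         END { for (s in sum) printf "%s %d %.1f\n", s, n[s], sum[s] / n[s] }' |
    LC_ALL=C sort
}

avg_by_state
```

    IL is the only state with two sample accounts, so its line reads IL 2 42513.0, the mean of 39225 and 45801.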
    

    Instead of sorting the results by count, we could sort using the result of the nested aggregation by specifying the order within the terms aggregation:

    $ curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "size": 0,
      "aggs": {
        "group_by_state": {
          "terms": {
            "field": "state.keyword",
            "order": {
              "average_balance": "desc"
            }
          },
          "aggs": {
            "average_balance": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
    '
    
    {
      "took" : 37,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1000,
          "relation" : "eq"
        },
        "max_score" : null,
        "hits" : [ ]
      },
      "aggregations" : {
        "group_by_state" : {
          "doc_count_error_upper_bound" : -1,
          "sum_other_doc_count" : 827,
          "buckets" : [
            {
              "key" : "CO",
              "doc_count" : 14,
              "average_balance" : {
                "value" : 32460.35714285714
              }
            },
            {
              "key" : "NE",
              "doc_count" : 16,
              "average_balance" : {
                "value" : 32041.5625
              }
            },
            {
              "key" : "AZ",
              "doc_count" : 14,
              "average_balance" : {
                "value" : 31634.785714285714
              }
            },
            {
              "key" : "MT",
              "doc_count" : 17,
              "average_balance" : {
                "value" : 31147.41176470588
              }
            },
            {
              "key" : "VA",
              "doc_count" : 16,
              "average_balance" : {
                "value" : 30600.0625
              }
            },
            {
              "key" : "GA",
              "doc_count" : 19,
              "average_balance" : {
                "value" : 30089.0
              }
            },
            {
              "key" : "MA",
              "doc_count" : 24,
              "average_balance" : {
                "value" : 29600.333333333332
              }
            },
            {
              "key" : "IL",
              "doc_count" : 22,
              "average_balance" : {
                "value" : 29489.727272727272
              }
            },
            {
              "key" : "NM",
              "doc_count" : 14,
              "average_balance" : {
                "value" : 28792.64285714286
              }
            },
            {
              "key" : "LA",
              "doc_count" : 17,
              "average_balance" : {
                "value" : 28791.823529411766
              }
            }
          ]
        }
      }
    }    
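
    Comparing this output with the previous one shows that the order setting changes which buckets are selected, not just how the same ten buckets are arranged. The sketch below takes the ten buckets from the earlier count-ordered response and re-sorts them by average balance on the client side. The helper name by_average_desc is hypothetical.

```shell
#!/bin/sh
# "state doc_count average_balance" for the ten buckets returned by the
# count-ordered request earlier in this section (values copied verbatim).
buckets='TX 30 26073.3
MD 28 26161.535714285714
ID 27 24368.777777777777
AL 25 25739.56
ME 25 21663.0
TN 25 28365.4
WY 25 21731.52
DC 24 23180.583333333332
MA 24 29600.333333333332
ND 24 26577.333333333332'

# by_average_desc N
# Prints the N buckets with the highest average balance.
by_average_desc() {
  printf '%s\n' "$buckets" | LC_ALL=C sort -k3,3nr | head -n "$1"
}

by_average_desc 3
```

    Re-sorting on the client puts MA, TN, and ND on top, whereas the server-side ordering above starts with CO, NE, and AZ, states with fewer accounts that never made it into the count-ordered top ten. Ordering inside the terms aggregation is therefore the right choice when the goal is the highest averages across all states.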
    


    Docker & K8s

    1. Docker install on Amazon Linux AMI
    2. Docker install on EC2 Ubuntu 14.04
    3. Docker container vs Virtual Machine
    4. Docker install on Ubuntu 14.04
    5. Docker Hello World Application
    6. Nginx image - share/copy files, Dockerfile
    7. Working with Docker images : brief introduction
    8. Docker image and container via docker commands (search, pull, run, ps, restart, attach, and rm)
    9. More on docker run command (docker run -it, docker run --rm, etc.)
    10. Docker Networks - Bridge Driver Network
    11. Docker Persistent Storage
    12. File sharing between host and container (docker run -d -p -v)
    13. Linking containers and volume for datastore
    14. Dockerfile - Build Docker images automatically I - FROM, MAINTAINER, and build context
    15. Dockerfile - Build Docker images automatically II - revisiting FROM, MAINTAINER, build context, and caching
    16. Dockerfile - Build Docker images automatically III - RUN
    17. Dockerfile - Build Docker images automatically IV - CMD
    18. Dockerfile - Build Docker images automatically V - WORKDIR, ENV, ADD, and ENTRYPOINT
    19. Docker - Apache Tomcat
    20. Docker - NodeJS
    21. Docker - NodeJS with hostname
    22. Docker Compose - NodeJS with MongoDB
    23. Docker - Prometheus and Grafana with Docker-compose
    24. Docker - StatsD/Graphite/Grafana
    25. Docker - Deploying a Java EE JBoss/WildFly Application on AWS Elastic Beanstalk Using Docker Containers
    26. Docker : NodeJS with GCP Kubernetes Engine
    27. Docker : Jenkins Multibranch Pipeline with Jenkinsfile and Github
    28. Docker : Jenkins Master and Slave
    29. Docker - ELK : ElasticSearch, Logstash, and Kibana
    30. Docker - ELK 7.6 : Elasticsearch on Centos 7
    31. Docker - ELK 7.6 : Filebeat on Centos 7
    32. Docker - ELK 7.6 : Logstash on Centos 7
    33. Docker - ELK 7.6 : Kibana on Centos 7
    34. Docker - ELK 7.6 : Elastic Stack with Docker Compose
    35. Docker - Deploy Elastic Cloud on Kubernetes (ECK) via Elasticsearch operator on minikube
    36. Docker - Deploy Elastic Stack via Helm on minikube
    37. Docker Compose - A gentle introduction with WordPress
    38. Docker Compose - MySQL
    39. MEAN Stack app on Docker containers : micro services
    40. MEAN Stack app on Docker containers : micro services via docker-compose
    41. Docker Compose - Hashicorp's Vault and Consul Part A (install vault, unsealing, static secrets, and policies)
    42. Docker Compose - Hashicorp's Vault and Consul Part B (EaaS, dynamic secrets, leases, and revocation)
    43. Docker Compose - Hashicorp's Vault and Consul Part C (Consul)
    44. Docker Compose with two containers - Flask REST API service container and an Apache server container
    45. Docker compose : Nginx reverse proxy with multiple containers
    46. Docker & Kubernetes : Envoy - Getting started
    47. Docker & Kubernetes : Envoy - Front Proxy
    48. Docker & Kubernetes : Ambassador - Envoy API Gateway on Kubernetes
    49. Docker Packer
    50. Docker Cheat Sheet
    51. Docker Q & A #1
    52. Kubernetes Q & A - Part I
    53. Kubernetes Q & A - Part II
    54. Docker - Run a React app in a docker
    55. Docker - Run a React app in a docker II (snapshot app with nginx)
    56. Docker - NodeJS and MySQL app with React in a docker
    57. Docker - Step by Step NodeJS and MySQL app with React - I
    58. Installing LAMP via puppet on Docker
    59. Docker install via Puppet
    60. Nginx Docker install via Ansible
    61. Apache Hadoop CDH 5.8 Install with QuickStarts Docker
    62. Docker - Deploying Flask app to ECS
    63. Docker Compose - Deploying WordPress to AWS
    64. Docker - WordPress Deploy to ECS with Docker-Compose (ECS-CLI EC2 type)
    65. Docker - WordPress Deploy to ECS with Docker-Compose (ECS-CLI Fargate type)
    66. Docker - ECS Fargate
    67. Docker - AWS ECS service discovery with Flask and Redis
    68. Docker & Kubernetes : minikube
    69. Docker & Kubernetes 2 : minikube Django with Postgres - persistent volume
    70. Docker & Kubernetes 3 : minikube Django with Redis and Celery
    71. Docker & Kubernetes 4 : Django with RDS via AWS Kops
    72. Docker & Kubernetes : Kops on AWS
    73. Docker & Kubernetes : Ingress controller on AWS with Kops
    74. Docker & Kubernetes : HashiCorp's Vault and Consul on minikube
    75. Docker & Kubernetes : HashiCorp's Vault and Consul - Auto-unseal using Transit Secrets Engine
    76. Docker & Kubernetes : Persistent Volumes & Persistent Volumes Claims - hostPath and annotations
    77. Docker & Kubernetes : Persistent Volumes - Dynamic volume provisioning
    78. Docker & Kubernetes : DaemonSet
    79. Docker & Kubernetes : Secrets
    80. Docker & Kubernetes : kubectl command
    81. Docker & Kubernetes : Assign a Kubernetes Pod to a particular node in a Kubernetes cluster
    82. Docker & Kubernetes : Configure a Pod to Use a ConfigMap
    83. AWS : EKS (Elastic Container Service for Kubernetes)
    84. Docker & Kubernetes : Run a React app in a minikube
    85. Docker & Kubernetes : Minikube install on AWS EC2
    86. Docker & Kubernetes : Cassandra with a StatefulSet
    87. Docker & Kubernetes : Terraform and AWS EKS
    88. Docker & Kubernetes : Pods and Service definitions
    89. Docker & Kubernetes : Service IP and the Service Type
    90. Docker & Kubernetes : Kubernetes DNS with Pods and Services
    91. Docker & Kubernetes : Headless service and discovering pods
    92. Docker & Kubernetes : Scaling and Updating application
    93. Docker & Kubernetes : Horizontal pod autoscaler on minikubes
    94. Docker & Kubernetes : From a monolithic app to micro services on GCP Kubernetes
    95. Docker & Kubernetes : Rolling updates
    96. Docker & Kubernetes : Deployments to GKE (Rolling update, Canary and Blue-green deployments)
    97. Docker & Kubernetes : Slack Chat Bot with NodeJS on GCP Kubernetes
    98. Docker & Kubernetes : Continuous Delivery with Jenkins Multibranch Pipeline for Dev, Canary, and Production Environments on GCP Kubernetes
    99. Docker & Kubernetes : NodePort vs LoadBalancer vs Ingress
    100. Docker & Kubernetes : MongoDB / MongoExpress on Minikube
    101. Docker & Kubernetes : Load Testing with Locust on GCP Kubernetes
    102. Docker & Kubernetes : MongoDB with StatefulSets on GCP Kubernetes Engine
    103. Docker & Kubernetes : Nginx Ingress Controller on Minikube
    104. Docker & Kubernetes : Setting up Ingress with NGINX Controller on Minikube (Mac)
    105. Docker & Kubernetes : Nginx Ingress Controller for Dashboard service on Minikube
    106. Docker & Kubernetes : Nginx Ingress Controller on GCP Kubernetes
    107. Docker & Kubernetes : Kubernetes Ingress with AWS ALB Ingress Controller in EKS
    108. Docker & Kubernetes : Setting up a private cluster on GCP Kubernetes
    109. Docker & Kubernetes : Kubernetes Namespaces (default, kube-public, kube-system) and switching namespaces (kubens)
    110. Docker & Kubernetes : StatefulSets on minikube
    111. Docker & Kubernetes : RBAC
    112. Docker & Kubernetes Service Account, RBAC, and IAM
    113. Docker & Kubernetes - Kubernetes Service Account, RBAC, IAM with EKS ALB, Part 1
    114. Docker & Kubernetes : Helm Chart
    115. Docker & Kubernetes : My first Helm deploy
    116. Docker & Kubernetes : Readiness and Liveness Probes
    117. Docker & Kubernetes : Helm chart repository with Github pages
    118. Docker & Kubernetes : Deploying WordPress and MariaDB with Ingress to Minikube using Helm Chart
    119. Docker & Kubernetes : Deploying WordPress and MariaDB to AWS using Helm 2 Chart
    120. Docker & Kubernetes : Deploying WordPress and MariaDB to AWS using Helm 3 Chart
    121. Docker & Kubernetes : Helm Chart for Node/Express and MySQL with Ingress
    122. Docker & Kubernetes : Deploy Prometheus and Grafana using Helm and Prometheus Operator - Monitoring Kubernetes node resources out of the box
    123. Docker & Kubernetes : Deploy Prometheus and Grafana using kube-prometheus-stack Helm Chart
    124. Docker & Kubernetes : Istio (service mesh) sidecar proxy on GCP Kubernetes
    125. Docker & Kubernetes : Istio on EKS
    126. Docker & Kubernetes : Istio on Minikube with AWS EC2 for Bookinfo Application
    127. Docker & Kubernetes : Deploying .NET Core app to Kubernetes Engine and configuring its traffic managed by Istio (Part I)
    128. Docker & Kubernetes : Deploying .NET Core app to Kubernetes Engine and configuring its traffic managed by Istio (Part II - Prometheus, Grafana, pin a service, split traffic, and inject faults)
    129. Docker & Kubernetes : Helm Package Manager with MySQL on GCP Kubernetes Engine
    130. Docker & Kubernetes : Deploying Memcached on Kubernetes Engine
    131. Docker & Kubernetes : EKS Control Plane (API server) Metrics with Prometheus
    132. Docker & Kubernetes : Spinnaker on EKS with Halyard
    133. Docker & Kubernetes : Continuous Delivery Pipelines with Spinnaker and Kubernetes Engine
    134. Docker & Kubernetes : Multi-node Local Kubernetes cluster : Kubeadm-dind (docker-in-docker)
    135. Docker & Kubernetes : Multi-node Local Kubernetes cluster : Kubeadm-kind (k8s-in-docker)
    136. Docker & Kubernetes : nodeSelector, nodeAffinity, taints/tolerations, pod affinity and anti-affinity - Assigning Pods to Nodes
    137. Docker & Kubernetes : Jenkins-X on EKS
    138. Docker & Kubernetes : ArgoCD App of Apps with Helm on Kubernetes
    139. Docker & Kubernetes : ArgoCD on Kubernetes cluster
    140. Docker & Kubernetes : GitOps with ArgoCD for Continuous Delivery to Kubernetes clusters (minikube) - guestbook



    Ph.D. / Golden Gate Ave, San Francisco / Seoul National Univ / Carnegie Mellon / UC Berkeley / DevOps / Deep Learning / Visualization


    Sponsor Open Source development activities and free contents for everyone.

    Thank you.

    - K Hong







    Docker & K8s



    Docker install on Amazon Linux AMI

    Docker install on EC2 Ubuntu 14.04

    Docker container vs Virtual Machine

    Docker install on Ubuntu 14.04

    Docker Hello World Application

    Nginx image - share/copy files, Dockerfile

    Working with Docker images : brief introduction

    Docker image and container via docker commands (search, pull, run, ps, restart, attach, and rm)

    More on docker run command (docker run -it, docker run --rm, etc.)

    Docker Networks - Bridge Driver Network

    Docker Persistent Storage

    File sharing between host and container (docker run -d -p -v)

    Linking containers and volume for datastore

    Dockerfile - Build Docker images automatically I - FROM, MAINTAINER, and build context

    Dockerfile - Build Docker images automatically II - revisiting FROM, MAINTAINER, build context, and caching

    Dockerfile - Build Docker images automatically III - RUN

    Dockerfile - Build Docker images automatically IV - CMD

    Dockerfile - Build Docker images automatically V - WORKDIR, ENV, ADD, and ENTRYPOINT

    Docker - Apache Tomcat

    Docker - NodeJS

    Docker - NodeJS with hostname

    Docker Compose - NodeJS with MongoDB

    Docker - Prometheus and Grafana with Docker-compose

    Docker - StatsD/Graphite/Grafana

    Docker - Deploying a Java EE JBoss/WildFly Application on AWS Elastic Beanstalk Using Docker Containers

    Docker : NodeJS with GCP Kubernetes Engine

    Docker : Jenkins Multibranch Pipeline with Jenkinsfile and Github

    Docker : Jenkins Master and Slave

    Docker - ELK : ElasticSearch, Logstash, and Kibana

    Docker - ELK 7.6 : Elasticsearch on Centos 7 Docker - ELK 7.6 : Filebeat on Centos 7

    Docker - ELK 7.6 : Logstash on Centos 7

    Docker - ELK 7.6 : Kibana on Centos 7 Part 1

    Docker - ELK 7.6 : Kibana on Centos 7 Part 2

    Docker - ELK 7.6 : Elastic Stack with Docker Compose

    Docker - Deploy Elastic Cloud on Kubernetes (ECK) via Elasticsearch operator on minikube

    Docker - Deploy Elastic Stack via Helm on minikube

    Docker Compose - A gentle introduction with WordPress

    Docker Compose - MySQL

    MEAN Stack app on Docker containers : micro services

    Docker Compose - Hashicorp's Vault and Consul Part A (install vault, unsealing, static secrets, and policies)

    Docker Compose - Hashicorp's Vault and Consul Part B (EaaS, dynamic secrets, leases, and revocation)

    Docker Compose - Hashicorp's Vault and Consul Part C (Consul)

    Docker Compose with two containers - Flask REST API service container and an Apache server container

    Docker compose : Nginx reverse proxy with multiple containers

    Docker compose : Nginx reverse proxy with multiple containers

    Docker & Kubernetes : Envoy - Getting started

    Docker & Kubernetes : Envoy - Front Proxy

    Docker & Kubernetes : Ambassador - Envoy API Gateway on Kubernetes

    Docker Packer

    Docker Cheat Sheet

    Docker Q & A

    Kubernetes Q & A - Part I

    Kubernetes Q & A - Part II

    Docker - Run a React app in a docker

    Docker - Run a React app in a docker II (snapshot app with nginx)

    Docker - NodeJS and MySQL app with React in a docker

    Docker - Step by Step NodeJS and MySQL app with React - I

    Installing LAMP via puppet on Docker

    Docker install via Puppet

    Nginx Docker install via Ansible

    Apache Hadoop CDH 5.8 Install with QuickStarts Docker

    Docker - Deploying Flask app to ECS

    Docker Compose - Deploying WordPress to AWS

    Docker - WordPress Deploy to ECS with Docker-Compose (ECS-CLI EC2 type)

    Docker - ECS Fargate

    Docker - AWS ECS service discovery with Flask and Redis

    Docker & Kubernetes: minikube version: v1.31.2, 2023

    Docker & Kubernetes 1 : minikube

    Docker & Kubernetes 2 : minikube Django with Postgres - persistent volume

    Docker & Kubernetes 3 : minikube Django with Redis and Celery

    Docker & Kubernetes 4 : Django with RDS via AWS Kops

    Docker & Kubernetes : Kops on AWS

    Docker & Kubernetes : Ingress controller on AWS with Kops

    Docker & Kubernetes : HashiCorp's Vault and Consul on minikube

    Docker & Kubernetes : HashiCorp's Vault and Consul - Auto-unseal using Transit Secrets Engine

    Docker & Kubernetes : Persistent Volumes & Persistent Volumes Claims - hostPath and annotations

    Docker & Kubernetes : Persistent Volumes - Dynamic volume provisioning

    Docker & Kubernetes : DaemonSet

    Docker & Kubernetes : Secrets

    Docker & Kubernetes : kubectl command

    Docker & Kubernetes : Assign a Kubernetes Pod to a particular node in a Kubernetes cluster

    Docker & Kubernetes : Configure a Pod to Use a ConfigMap

    AWS : EKS (Elastic Container Service for Kubernetes)

    Docker & Kubernetes : Run a React app in a minikube

    Docker & Kubernetes : Minikube install on AWS EC2

    Docker & Kubernetes : Cassandra with a StatefulSet

    Docker & Kubernetes : Terraform and AWS EKS

    Docker & Kubernetes : Pods and Service definitions

    Docker & Kubernetes : Headless service and discovering pods

    Docker & Kubernetes : Service IP and the Service Type

    Docker & Kubernetes : Kubernetes DNS with Pods and Services

    Docker & Kubernetes - Scaling and Updating application

    Docker & Kubernetes : Horizontal pod autoscaler on minikube

    Docker & Kubernetes : NodePort vs LoadBalancer vs Ingress

    Docker & Kubernetes : Load Testing with Locust on GCP Kubernetes

    Docker & Kubernetes : From a monolithic app to micro services on GCP Kubernetes

    Docker & Kubernetes : Rolling updates

    Docker & Kubernetes : Deployments to GKE (Rolling update, Canary and Blue-green deployments)

    Docker & Kubernetes : Slack Chat Bot with NodeJS on GCP Kubernetes

    Docker & Kubernetes : Continuous Delivery with Jenkins Multibranch Pipeline for Dev, Canary, and Production Environments on GCP Kubernetes

    Docker & Kubernetes - MongoDB with StatefulSets on GCP Kubernetes Engine

    Docker & Kubernetes : Nginx Ingress Controller on minikube

    Docker & Kubernetes : Setting up Ingress with NGINX Controller on Minikube (Mac)

    Docker & Kubernetes : Nginx Ingress Controller for Dashboard service on Minikube

    Docker & Kubernetes : Nginx Ingress Controller on GCP Kubernetes

    Docker & Kubernetes : Kubernetes Ingress with AWS ALB Ingress Controller in EKS

    Docker & Kubernetes : MongoDB / MongoExpress on Minikube

    Docker & Kubernetes : Setting up a private cluster on GCP Kubernetes

    Docker & Kubernetes : Kubernetes Namespaces (default, kube-public, kube-system) and switching namespaces (kubens)

    Docker & Kubernetes : StatefulSets on minikube

    Docker & Kubernetes : RBAC

    Docker & Kubernetes Service Account, RBAC, and IAM

    Docker & Kubernetes - Kubernetes Service Account, RBAC, IAM with EKS ALB, Part 1

    Docker & Kubernetes : Helm Chart

    Docker & Kubernetes : My first Helm deploy

    Docker & Kubernetes : Readiness and Liveness Probes

    Docker & Kubernetes : Helm chart repository with Github pages

    Docker & Kubernetes : Deploying WordPress and MariaDB with Ingress to Minikube using Helm Chart

    Docker & Kubernetes : Deploying WordPress and MariaDB to AWS using Helm 2 Chart

    Docker & Kubernetes : Deploying WordPress and MariaDB to AWS using Helm 3 Chart

    Docker & Kubernetes : Helm Chart for Node/Express and MySQL with Ingress

    Docker & Kubernetes : Docker_Helm_Chart_Node_Expess_MySQL_Ingress.php

    Docker & Kubernetes: Deploy Prometheus and Grafana using Helm and Prometheus Operator - Monitoring Kubernetes node resources out of the box

    Docker & Kubernetes : Deploy Prometheus and Grafana using kube-prometheus-stack Helm Chart

    Docker & Kubernetes : Istio (service mesh) sidecar proxy on GCP Kubernetes

    Docker & Kubernetes : Istio on EKS

    Docker & Kubernetes : Istio on Minikube with AWS EC2 for Bookinfo Application

    Docker & Kubernetes : Deploying .NET Core app to Kubernetes Engine and configuring its traffic managed by Istio (Part I)

    Docker & Kubernetes : Deploying .NET Core app to Kubernetes Engine and configuring its traffic managed by Istio (Part II - Prometheus, Grafana, pin a service, split traffic, and inject faults)

    Docker & Kubernetes : Helm Package Manager with MySQL on GCP Kubernetes Engine

    Docker & Kubernetes : Deploying Memcached on Kubernetes Engine

    Docker & Kubernetes : EKS Control Plane (API server) Metrics with Prometheus

    Docker & Kubernetes : Spinnaker on EKS with Halyard

    Docker & Kubernetes : Continuous Delivery Pipelines with Spinnaker and Kubernetes Engine

    Docker & Kubernetes: Multi-node Local Kubernetes cluster - Kubeadm-dind(docker-in-docker)

    Docker & Kubernetes: Multi-node Local Kubernetes cluster - Kubeadm-kind(k8s-in-docker)

    Docker & Kubernetes : nodeSelector, nodeAffinity, taints/tolerations, pod affinity and anti-affinity - Assigning Pods to Nodes

    Docker & Kubernetes : Jenkins-X on EKS

    Docker & Kubernetes : ArgoCD App of Apps with Helm on Kubernetes

    Docker & Kubernetes : ArgoCD on Kubernetes cluster

    Docker & Kubernetes : GitOps with ArgoCD for Continuous Delivery to Kubernetes clusters (minikube) - guestbook




    Sponsor Open Source development activities and free content for everyone.

    Thank you.

    - K Hong







    Ansible 2.0



    What is Ansible?

    Quick Preview - Setting up web servers with Nginx, configure environments, and deploy an App

    SSH connection & running commands

    Ansible: Playbook for Tomcat 9 on Ubuntu 18.04 systemd with AWS

    Modules

    Playbooks

    Handlers

    Roles

    Playbook for LAMP HAProxy

    Installing Nginx on a Docker container

    AWS : Creating an ec2 instance & adding keys to authorized_keys

    AWS : Auto Scaling via AMI

    AWS : creating an ELB & registers an EC2 instance from the ELB

    Deploying Wordpress micro-services with Docker containers on Vagrant box via Ansible

    Setting up Apache web server

    Deploying a Go app to Minikube

    Ansible with Terraform





    Terraform



    Introduction to Terraform with AWS elb & nginx

    Terraform Tutorial - terraform format(tf) and interpolation(variables)

    Terraform Tutorial - user_data

    Terraform Tutorial - variables

    Terraform 12 Tutorial - Loops with count, for_each, and for

    Terraform Tutorial - creating multiple instances (count, list type and element() function)

    Terraform Tutorial - State (terraform.tfstate) & terraform import

    Terraform Tutorial - Output variables

    Terraform Tutorial - Destroy

    Terraform Tutorial - Modules

    Terraform Tutorial - Creating AWS S3 bucket / SQS queue resources and notifying bucket event to queue

    Terraform Tutorial - AWS ASG and Modules

    Terraform Tutorial - VPC, Subnets, RouteTable, ELB, Security Group, and Apache server I

    Terraform Tutorial - VPC, Subnets, RouteTable, ELB, Security Group, and Apache server II

    Terraform Tutorial - Docker nginx container with ALB and dynamic autoscaling

    Terraform Tutorial - AWS ECS using Fargate : Part I

    Hashicorp Vault

    HashiCorp Vault Agent

    HashiCorp Vault and Consul on AWS with Terraform

    Ansible with Terraform

    AWS IAM user, group, role, and policies - part 1

    AWS IAM user, group, role, and policies - part 2

    Delegate Access Across AWS Accounts Using IAM Roles

    AWS KMS

    terraform import & terraformer import

    Terraform commands cheat sheet

    Terraform Cloud

    Terraform 14

    Creating Private TLS Certs





    DevOps



    Phases of Continuous Integration

    Software development methodology

    Introduction to DevOps

    Samples of Continuous Integration (CI) / Continuous Delivery (CD) - Use cases

    Artifact repository and repository management

    Linux - General, shell programming, processes & signals ...

    RabbitMQ...

    MariaDB

    New Relic APM with NodeJS : simple agent setup on AWS instance

    Nagios on CentOS 7 with Nagios Remote Plugin Executor (NRPE)

    Nagios - The industry standard in IT infrastructure monitoring on Ubuntu

    Zabbix 3 install on Ubuntu 14.04 & adding hosts / items / graphs

    Datadog - Monitoring with PagerDuty/HipChat and APM

    Install and Configure Mesos Cluster

    Cassandra on a Single-Node Cluster

    Container Orchestration : Docker Swarm vs Kubernetes vs Apache Mesos

    OpenStack install on Ubuntu 16.04 server - DevStack

    AWS EC2 Container Service (ECS) & EC2 Container Registry (ECR) | Docker Registry

    CI/CD with CircleCI - Heroku deploy

    Introduction to Terraform with AWS elb & nginx

    Docker & Kubernetes

    Kubernetes I - Running Kubernetes Locally via Minikube

    Kubernetes II - kops on AWS

    Kubernetes III - kubeadm on AWS

    AWS : EKS (Elastic Container Service for Kubernetes)

    CI/CD Github actions

    CI/CD Gitlab



    DevOps / Sys Admin Q & A



    (1A) - Linux Commands

    (1B) - Linux Commands

    (2) - Networks

    (2B) - Networks

    (3) - Linux Systems

    (4) - Scripting (Ruby/Shell)

    (5) - Configuration Management

    (6) - AWS VPC setup (public/private subnets with NAT)

    (6B) - AWS VPC Peering

    (7) - Web server

    (8) - Database

    (9) - Linux System / Application Monitoring, Performance Tuning, Profiling Methods & Tools

    (10) - Troubleshooting: Load, Throughput, Response time and Leaks

    (11) - SSH key pairs, SSL Certificate, and SSL Handshake

    (12) - Why is the database slow?

    (13) - Is my web site down?

    (14) - Is my server down?

    (15) - Why is the server sluggish?

    (16A) - Serving multiple domains using Virtual Hosts - Apache

    (16B) - Serving multiple domains using server block - Nginx

    (16C) - Reverse proxy servers and load balancers - Nginx

    (17) - Linux startup process

    (18) - phpMyAdmin with Nginx virtual host as a subdomain

    (19) - How to SSH login without password?

    (20) - Log Rotation

    (21) - Monitoring Metrics

    (22) - lsof

    (23) - Wireshark introduction

    (24) - User account management

    (25) - Domain Name System (DNS)

    (26) - NGINX SSL/TLS, Caching, and Session

    (27) - Troubleshooting 5xx server errors

    (28) - Linux Systemd: journalctl

    (29) - Linux Systemd: FirewallD

    (30) - Linux: SELinux

    (31) - Linux: Samba

    (0) - Linux Sys Admin's Day to Day tasks





    Jenkins



    Install

    Configuration - Manage Jenkins - security setup

    Adding job and build

    Scheduling jobs

    Managing plugins

    Git/GitHub plugins, SSH keys configuration, and Fork/Clone

    JDK & Maven setup

    Build configuration for GitHub Java application with Maven

    Build Action for GitHub Java application with Maven - Console Output, Updating Maven

    Committing changes to GitHub & new test results - Build Failure

    Committing changes to GitHub & new test results - Successful Build

    Adding code coverage and metrics

    Jenkins on EC2 - creating an EC2 account, ssh to EC2, and install Apache server

    Jenkins on EC2 - setting up Jenkins account, plugins, and Configure System (JAVA_HOME, MAVEN_HOME, notification email)

    Jenkins on EC2 - Creating a Maven project

    Jenkins on EC2 - Configuring GitHub Hook and Notification service to Jenkins server for any changes to the repository

    Jenkins on EC2 - Line Coverage with JaCoCo plugin

    Setting up Master and Slave nodes

    Jenkins Build Pipeline & Dependency Graph Plugins

    Jenkins Build Flow Plugin

    Pipeline Jenkinsfile with Classic / Blue Ocean

    Jenkins Setting up Slave nodes on AWS

    Jenkins Q & A





    Puppet



    Puppet with Amazon AWS I - Puppet accounts

    Puppet with Amazon AWS II (ssh & puppetmaster/puppet install)

    Puppet with Amazon AWS III - Puppet running Hello World

    Puppet Code Basics - Terminology

    Puppet with Amazon AWS on CentOS 7 (I) - Master setup on EC2

    Puppet with Amazon AWS on CentOS 7 (II) - Configuring a Puppet Master Server with Passenger and Apache

    Puppet master /agent ubuntu 14.04 install on EC2 nodes

    Puppet master post install tasks - master's names and certificates setup

    Puppet agent post install tasks - configure agent, hostnames, and sign request

    EC2 Puppet master/agent basic tasks - main manifest with a file resource/module and immediate execution on an agent node

    Setting up puppet master and agent with simple scripts on EC2 / remote install from desktop

    EC2 Puppet - Install lamp with a manifest ('puppet apply')

    EC2 Puppet - Install lamp with a module

    Puppet variable scope

    Puppet packages, services, and files

    Puppet packages, services, and files II with nginx Puppet templates

    Puppet creating and managing user accounts with SSH access

    Puppet Locking user accounts & deploying sudoers file

    Puppet exec resource

    Puppet classes and modules

    Puppet Forge modules

    Puppet Express

    Puppet Express 2

    Puppet 4 : Changes

    Puppet --configprint

    Puppet with Docker

    Puppet 6.0.2 install on Ubuntu 18.04





    Chef



    What is Chef?

    Chef install on Ubuntu 14.04 - Local Workstation via omnibus installer

    Setting up Hosted Chef server

    VirtualBox via Vagrant with Chef client provision

    Creating and using cookbooks on a VirtualBox node

    Chef server install on Ubuntu 14.04

    Chef workstation setup on EC2 Ubuntu 14.04

    Chef Client Node - Knife Bootstrapping a node on EC2 ubuntu 14.04





    Elasticsearch search engine, Logstash, and Kibana



    Elasticsearch, search engine

    Logstash with Elasticsearch

    Logstash, Elasticsearch, and Kibana 4

    Elasticsearch with Redis broker and Logstash Shipper and Indexer

    Samples of ELK architecture

    Elasticsearch indexing performance



    Vagrant



    VirtualBox & Vagrant install on Ubuntu 14.04

    Creating a VirtualBox using Vagrant

    Provisioning

    Networking - Port Forwarding

    Vagrant Share

    Vagrant Rebuild & Teardown

    Vagrant & Ansible





    Big Data & Hadoop Tutorials



    Hadoop 2.6 - Installing on Ubuntu 14.04 (Single-Node Cluster)

    Hadoop 2.6.5 - Installing on Ubuntu 16.04 (Single-Node Cluster)

    Hadoop - Running MapReduce Job

    Hadoop - Ecosystem

    CDH5.3 Install on four EC2 instances (1 Name node and 3 Datanodes) using Cloudera Manager 5

    CDH5 APIs

    QuickStart VMs for CDH 5.3

    QuickStart VMs for CDH 5.3 II - Testing with wordcount

    QuickStart VMs for CDH 5.3 II - Hive DB query

    Scheduled start and stop CDH services

    CDH 5.8 Install with QuickStarts Docker

    Zookeeper & Kafka Install

    Zookeeper & Kafka - single node single broker

    Zookeeper & Kafka - Single node and multiple brokers

    OLTP vs OLAP

    Apache Hadoop Tutorial I with CDH - Overview

    Apache Hadoop Tutorial II with CDH - MapReduce Word Count

    Apache Hadoop Tutorial III with CDH - MapReduce Word Count 2

    Apache Hadoop (CDH 5) Hive Introduction

    CDH5 - Hive Upgrade to 1.3 from 1.2

    Apache Hive 2.1.0 install on Ubuntu 16.04

    Apache HBase in Pseudo-Distributed mode

    Creating HBase table with HBase shell and HUE

    Apache Hadoop : Hue 3.11 install on Ubuntu 16.04

    Creating HBase table with Java API

    HBase - Map, Persistent, Sparse, Sorted, Distributed and Multidimensional

    Flume with CDH5: a single-node Flume deployment (telnet example)

    Apache Hadoop (CDH 5) Flume with VirtualBox : syslog example via NettyAvroRpcClient

    List of Apache Hadoop hdfs commands

    Apache Hadoop : Creating Wordcount Java Project with Eclipse Part 1

    Apache Hadoop : Creating Wordcount Java Project with Eclipse Part 2

    Apache Hadoop : Creating Card Java Project with Eclipse using Cloudera VM UnoExample for CDH5 - local run

    Apache Hadoop : Creating Wordcount Maven Project with Eclipse

    Wordcount MapReduce with Oozie workflow with Hue browser - CDH 5.3 Hadoop cluster using VirtualBox and QuickStart VM

    Spark 1.2 using VirtualBox and QuickStart VM - wordcount

    Spark Programming Model : Resilient Distributed Dataset (RDD) with CDH

    Apache Spark 2.0.2 with PySpark (Spark Python API) Shell

    Apache Spark 2.0.2 tutorial with PySpark : RDD

    Apache Spark 2.0.0 tutorial with PySpark : Analyzing Neuroimaging Data with Thunder

    Apache Spark Streaming with Kafka and Cassandra

    Apache Spark 1.2 with PySpark (Spark Python API) Wordcount using CDH5

    Apache Spark 1.2 Streaming

    Apache Drill with ZooKeeper install on Ubuntu 16.04 - Embedded & Distributed

    Apache Drill - Query File System, JSON, and Parquet

    Apache Drill - HBase query

    Apache Drill - Hive query

    Apache Drill - MongoDB query





    Redis In-Memory Database



    Redis vs Memcached

    Redis 3.0.1 Install

    Setting up multiple server instances on a Linux host

    Redis with Python

    ELK : Elasticsearch with Redis broker and Logstash Shipper and Indexer



    GCP (Google Cloud Platform)



    GCP: Creating an Instance

    GCP: gcloud compute command-line tool

    GCP: Deploying Containers

    GCP: Kubernetes Quickstart

    GCP: Deploying a containerized web application via Kubernetes

    GCP: Django Deploy via Kubernetes I (local)

    GCP: Django Deploy via Kubernetes II (GKE)





    AWS (Amazon Web Services)



    AWS : EKS (Elastic Container Service for Kubernetes)

    AWS : Creating a snapshot (cloning an image)

    AWS : Attaching Amazon EBS volume to an instance

    AWS : Adding swap space to an attached volume via mkswap and swapon

    AWS : Creating an EC2 instance and attaching Amazon EBS volume to the instance using Python boto module with User data

    AWS : Creating an instance to a new region by copying an AMI

    AWS : S3 (Simple Storage Service) 1

    AWS : S3 (Simple Storage Service) 2 - Creating and Deleting a Bucket

    AWS : S3 (Simple Storage Service) 3 - Bucket Versioning

    AWS : S3 (Simple Storage Service) 4 - Uploading a large file

    AWS : S3 (Simple Storage Service) 5 - Uploading folders/files recursively

    AWS : S3 (Simple Storage Service) 6 - Bucket Policy for File/Folder View/Download

    AWS : S3 (Simple Storage Service) 7 - How to Copy or Move Objects from one region to another

    AWS : S3 (Simple Storage Service) 8 - Archiving S3 Data to Glacier

    AWS : Creating a CloudFront distribution with an Amazon S3 origin

    AWS : Creating VPC with CloudFormation

    WAF (Web Application Firewall) with preconfigured CloudFormation template and Web ACL for CloudFront distribution

    AWS : CloudWatch & Logs with Lambda Function / S3

    AWS : Lambda Serverless Computing with EC2, CloudWatch Alarm, SNS

    AWS : Lambda and SNS - cross account

    AWS : CLI (Command Line Interface)

    AWS : CLI (ECS with ALB & autoscaling)

    AWS : ECS with cloudformation and json task definition

    AWS : AWS Application Load Balancer (ALB) and ECS with Flask app

    AWS : Load Balancing with HAProxy (High Availability Proxy)

    AWS : VirtualBox on EC2

    AWS : NTP setup on EC2

    AWS: jq with AWS

    AWS : AWS & OpenSSL : Creating / Installing a Server SSL Certificate

    AWS : OpenVPN Access Server 2 Install

    AWS : VPC (Virtual Private Cloud) 1 - netmask, subnets, default gateway, and CIDR

    AWS : VPC (Virtual Private Cloud) 2 - VPC Wizard

    AWS : VPC (Virtual Private Cloud) 3 - VPC Wizard with NAT

    AWS : DevOps / Sys Admin Q & A (VI) - AWS VPC setup (public/private subnets with NAT)

    AWS : OpenVPN Protocols : PPTP, L2TP/IPsec, and OpenVPN

    AWS : Autoscaling group (ASG)

    AWS : Setting up Autoscaling Alarms and Notifications via CLI and Cloudformation

    AWS : Adding a SSH User Account on Linux Instance

    AWS : Windows Servers - Remote Desktop Connections using RDP

    AWS : Scheduled stopping and starting an instance - python & cron

    AWS : Detecting stopped instance and sending an alert email using Mandrill smtp

    AWS : Elastic Beanstalk with NodeJS

    AWS : Elastic Beanstalk Inplace/Rolling Blue/Green Deploy

    AWS : Identity and Access Management (IAM) Roles for Amazon EC2

    AWS : Identity and Access Management (IAM) Policies, sts AssumeRole, and delegate access across AWS accounts

    AWS : Identity and Access Management (IAM) sts assume role via aws cli2

    AWS : Creating IAM Roles and associating them with EC2 Instances in CloudFormation

    AWS Identity and Access Management (IAM) Roles, SSO(Single Sign On), SAML(Security Assertion Markup Language), IdP(identity provider), STS(Security Token Service), and ADFS(Active Directory Federation Services)

    AWS : Amazon Route 53

    AWS : Amazon Route 53 - DNS (Domain Name Server) setup

    AWS : Amazon Route 53 - subdomain setup and virtual host on Nginx

    AWS Amazon Route 53 : Private Hosted Zone

    AWS : SNS (Simple Notification Service) example with ELB and CloudWatch

    AWS : Lambda with AWS CloudTrail

    AWS : SQS (Simple Queue Service) with NodeJS and AWS SDK

    AWS : Redshift data warehouse

    AWS : CloudFormation - templates, change sets, and CLI

    AWS : CloudFormation Bootstrap UserData/Metadata

    AWS : CloudFormation - Creating an ASG with rolling update

    AWS : Cloudformation Cross-stack reference

    AWS : OpsWorks

    AWS : Network Load Balancer (NLB) with Autoscaling group (ASG)

    AWS CodeDeploy : Deploy an Application from GitHub

    AWS EC2 Container Service (ECS)

    AWS EC2 Container Service (ECS) II

    AWS Hello World Lambda Function

    AWS Lambda Function Q & A

    AWS Node.js Lambda Function & API Gateway

    AWS API Gateway endpoint invoking Lambda function

    AWS API Gateway invoking Lambda function with Terraform

    AWS API Gateway invoking Lambda function with Terraform - Lambda Container

    Amazon Kinesis Streams

    Kinesis Data Firehose with Lambda and ElasticSearch

    Amazon DynamoDB

    Amazon DynamoDB with Lambda and CloudWatch

    Loading DynamoDB stream to AWS Elasticsearch service with Lambda

    Amazon ML (Machine Learning)

    Simple Systems Manager (SSM)

    AWS : RDS Connecting to a DB Instance Running the SQL Server Database Engine

    AWS : RDS Importing and Exporting SQL Server Data

    AWS : RDS PostgreSQL & pgAdmin III

    AWS : RDS PostgreSQL 2 - Creating/Deleting a Table

    AWS : MySQL Replication : Master-slave

    AWS : MySQL backup & restore

    AWS RDS : Cross-Region Read Replicas for MySQL and Snapshots for PostgreSQL

    AWS : Restoring Postgres on EC2 instance from S3 backup

    AWS : Q & A

    AWS : Security

    AWS : Security groups vs. network ACLs

    AWS : Scaling-Up

    AWS : Networking

    AWS : Single Sign-on (SSO) with Okta

    AWS : JIT (Just-in-Time) with Okta





    Powershell 4 Tutorial



    Powershell : Introduction

    Powershell : Help System

    Powershell : Running commands

    Powershell : Providers

    Powershell : Pipeline

    Powershell : Objects

    Powershell : Remote Control

    Windows Management Instrumentation (WMI)

    How to Enable Multiple RDP Sessions in Windows 2012 Server

    How to install and configure FTP server on IIS 8 in Windows 2012 Server

    How to Run Exe as a Service on Windows 2012 Server

    SQL Inner, Left, Right, and Outer Joins





    Git/GitHub Tutorial



    One page express tutorial for GIT and GitHub

    Installation

    add/status/log

    commit and diff

    git commit --amend

    Deleting and Renaming files

    Undoing Things : File Checkout & Unstaging

    Reverting commit

    Soft Reset - (git reset --soft <SHA key>)

    Mixed Reset - Default

    Hard Reset - (git reset --hard <SHA key>)

    Creating & switching Branches

    Fast-forward merge

    Rebase & Three-way merge

    Merge conflicts with a simple example

    GitHub Account and SSH

    Uploading to GitHub

    GUI

    Branching & Merging

    Merging conflicts

    GIT on Ubuntu and OS X - Focused on Branching

    Setting up a remote repository / pushing local project and cloning the remote repo

    Fork vs Clone, Origin vs Upstream

    Git/GitHub Terminologies

    Git/GitHub via SourceTree II : Branching & Merging

    Git/GitHub via SourceTree III : Git Work Flow

    Git/GitHub via SourceTree IV : Git Reset

    Git wiki - quick command reference






    Subversion

    Subversion Install On Ubuntu 14.04

    Subversion creating and accessing I

    Subversion creating and accessing II








    Contact

    BogoToBogo
    contactus@bogotobogo.com

    Follow Bogotobogo

    About Us

    YouTube: My YouTube channel
    Pacific Ave, San Francisco, CA 94115

    Copyright © 2024, bogotobogo
    Design: Web Master