
Elasticsearch: A Brief Note

Elasticsearch is a distributed, RESTful search and analytics engine used to provide enterprise search, observability and security, all on the same stack. In CAP theorem terms it is usually described as giving Availability and Partition tolerance, while Consistency can be tightened through configuration and a few tricks.

The Elastic Stack (ELK) is:

  1. Elasticsearch
  2. Integrations (Logstash)
  3. Kibana (UI dashboard to visualise and manage the data)
Elasticsearch is built on the open-source Lucene library, which is written in Java. The data can be stored anywhere: private, public or hybrid cloud. Elasticsearch splits each index into shards (primary and replica shards that hold the documents) and runs as multiple instances (nodes) in a cluster to provide high availability (HA) and partition tolerance.
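To make the sharding idea concrete, here is a minimal sketch of how a document could be assigned to a primary shard by hashing its routing key. It is a simplification under stated assumptions: Elasticsearch uses a Murmur3 hash internally, while this sketch swaps in CRC32 from the standard library, and the names pick_shard and total_shard_copies are made up for illustration.

```python
import zlib


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key (by default the document id) to a primary shard number."""
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # Stand-in hash: Elasticsearch itself uses Murmur3, not CRC32.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def total_shard_copies(num_primary_shards: int, replicas_per_primary: int) -> int:
    """Count every shard copy in the cluster: each primary plus its replicas."""
    return num_primary_shards * (1 + replicas_per_primary)


if __name__ == "__main__":
    for doc_id in ("book-1", "book-2", "book-3"):
        print(doc_id, "-> shard", pick_shard(doc_id, 3))
    print("shard copies:", total_shard_copies(3, 1))
```

Because the shard number depends on the number of primary shards, that number is fixed when the index is created, whereas the number of replicas can be changed later.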

Search queries in Elasticsearch can be simple as well as complicated.

To create a quick index:
PUT /bookdb_index
    { "settings": { "number_of_shards": 1 }}
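The same request can be prepared from Python with only the standard library. This is a rough sketch, assuming a node listening on http://localhost:9200 (an assumption, not something from this post); the helper name build_create_index_request is hypothetical. The request is only built here, not sent, so the snippet runs without a cluster.

```python
import json
import urllib.request


def build_create_index_request(base_url: str, index: str, number_of_shards: int = 1):
    """Build (but do not send) the PUT request that creates an index."""
    body = {"settings": {"number_of_shards": number_of_shards}}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed local node address; adjust for your own cluster.
req = build_create_index_request("http://localhost:9200", "bookdb_index")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing req to urllib.request.urlopen would actually send it to the cluster.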

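A quick example of a simple search (a URI search, where the query is passed as a URL parameter rather than a JSON body) is:
GET /bookdb_index/book/_search?q=guide

where book is the type and the q parameter searches for the term guide. Note that mapping types such as book were removed in Elasticsearch 8.x, where the same search goes to /bookdb_index/_search.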
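For comparison, the q parameter is parsed with the query string syntax, so the same search can be written as a JSON body using a query_string query. A small sketch of both forms follows; the function names uri_search_path and query_string_body are made up for illustration, and nothing is sent over the network.

```python
import json
from typing import Optional
from urllib.parse import urlencode


def uri_search_path(index: str, q: str, doc_type: Optional[str] = None) -> str:
    """Build the path of a URI search; doc_type only applies to older versions with types."""
    parts = [index] + ([doc_type] if doc_type else [])
    return "/" + "/".join(parts) + "/_search?" + urlencode({"q": q})


def query_string_body(q: str) -> dict:
    """Build the JSON body that expresses the same search as a query_string query."""
    return {"query": {"query_string": {"query": q}}}


print("GET", uri_search_path("bookdb_index", "guide", doc_type="book"))
print("GET /bookdb_index/_search")
print(json.dumps(query_string_body("guide"), indent=2))
```

The URI form is handy for quick checks, while the JSON body is what grows into the more complicated queries.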

One can also use the Kibana UI to visualise the data and play with it using UI queries.

All the details on the query JSON format can be found in the Elasticsearch documentation.

There is so much more to know about ES, and the internet is full of material to explore. For now, this brief info will do.

Some takeaway points on Elasticsearch:

  • NoSQL datastore
  • JSON based
  • Used RESTfully
  • Generally used for large, rapidly growing data like logs, metrics and app trace data.

See ya in the next one. :)


Extras:
* Offerings from ES (7.2 onwards) have data temperatures based on performance tradeoffs, where customers can decide how data is distributed across them on a per-index basis. For example, newer data is hot (for, let's say, 7 days), then warm (the next 30 days), then cold (the next 6 months) and finally frozen (from then on).

Performance declines as the temperature decreases, and so does the price.
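The example schedule above can be sketched as a small function that maps the age of the data to a temperature. The thresholds come straight from the example (7 days, then 30 days, then 6 months), with 6 months approximated as 180 days, which is an assumption of this sketch. The function name data_tier is hypothetical and this is not how Elasticsearch itself is configured.

```python
HOT_DAYS = 7      # newest data
WARM_DAYS = 30    # the next 30 days
COLD_DAYS = 180   # "next 6 months", approximated as 180 days (assumption)


def data_tier(age_days: int) -> str:
    """Return the temperature for data of the given age, using the example schedule."""
    if age_days < 0:
        raise ValueError("age cannot be negative")
    if age_days < HOT_DAYS:
        return "hot"
    if age_days < HOT_DAYS + WARM_DAYS:
        return "warm"
    if age_days < HOT_DAYS + WARM_DAYS + COLD_DAYS:
        return "cold"
    return "frozen"


for age in (0, 10, 100, 400):
    print(age, "days old ->", data_tier(age))
```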


* ES for log analysis:
  1.  One can have time-based indices and sharding.
  2.  Different nodes can be used for reads and writes to handle heavy traffic.
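As a rough illustration of time-based indices, the sketch below derives one index name per day, so a search over a date range only needs to touch the indices for those days. The prefix-YYYY.MM.DD naming pattern is a common convention but is an assumption here, and the helper names daily_index_name and indices_for_range are made up for this example.

```python
from datetime import date, timedelta


def daily_index_name(prefix: str, day: date) -> str:
    """Name of the index that holds the logs for a single day."""
    return f"{prefix}-{day:%Y.%m.%d}"


def indices_for_range(prefix: str, start: date, end: date) -> list:
    """Names of all daily indices covering start..end, both inclusive."""
    days = (end - start).days
    return [daily_index_name(prefix, start + timedelta(days=i)) for i in range(days + 1)]


print(daily_index_name("logs", date(2024, 1, 5)))
print(indices_for_range("logs", date(2024, 1, 30), date(2024, 2, 1)))
```

Old data can then be dropped or moved by acting on whole indices instead of deleting individual documents.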

* Nomenclature RDBMS vs NoSQL:  



