A book could be written on the subject, but to boil it down to three areas: It is not as good at being a data store as some other options like MongoDB, Hadoop, etc. For smaller use cases, it will perform fine. If you are streaming terabytes of data every day, you will find that..
Category: NoSQL Database – HBase, Kafka, Cassandra, Elasticsearch, Solr
Elasticsearch is a very popular open-source, cross-platform, distributed and scalable search engine based on Lucene. It is written in Java and was originally released under the terms of the Apache License. Elasticsearch is developed alongside Logstash and Kibana. Logstash is a data-collection and log-parsing engine, while Kibana is an analytics and visualization platform. The three products combined are referred to as the Elastic Stack (formerly known as the ELK stack). They are..
In HBase, tables are split into regions and are served by the region servers. Regions are vertically divided by column families into “Stores”. Stores are saved as files in HDFS. Shown below is the architecture of HBase. Note: The term ‘store’ is used for regions to explain the storage structure. HBase has three major components:..
Thanks for the A2A. This is analogous to the question “Why do we need a database when we have a data warehouse?” HBase is for OLTP and Hive is for OLAP. Let me give you a simple example: have you ever edited or updated a Facebook comment on a particular post? That update is done in real time; that’s where HBase comes in to..
Apache Kafka & Storm. Kafka and Storm integrate very well to form a real-time ecosystem. Industry today generates large volumes of streaming data that need to be processed in real time. Below are some examples. Hundreds of sensors are placed around a machine to monitor the health of the..
In short, search and log analytics. Elasticsearch has been deployed in many places that require very fast large-scale search, such as Stack Overflow, SoundCloud, the Rijksmuseum in Amsterdam, Fog Creek (40 billion lines of code), Verizon Business (50 billion documents), etc. The ELK stack (Elasticsearch, Logstash, Kibana) is now developed enough to be used for..
Solr is an open-source search platform used to build search applications. It is built on top of Lucene (a full-text search engine). Solr is enterprise-ready, fast and highly scalable. Applications built using Solr are sophisticated and deliver high performance. Yonik Seeley created Solr in 2004 in order to add search capabilities to the company website..
Apache Solr is an enterprise-capable, open-source search platform based on the Apache Lucene search library. Solr is probably the most widely deployed search platform worldwide. Solr is written in Java and provides both a RESTful XML interface and a JSON API with which search applications can be built. Solr enjoys a reputation for being..
Apache Cassandra is a highly scalable, high-performance distributed database designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. It is a type of NoSQL database. A NoSQL database (sometimes called Not Only SQL) is a database that provides a mechanism to store and retrieve..