By  digitalART2

Hadoop is the de facto standard for Big Data and HBase is the Hadoop database. 


Introduction to HBase

Traditional databases confine users to defining a strict data structure in order to reduce disk usage, and data tables must be joined again whenever users need a bigger logical view. This becomes a real bottleneck in the Big Data world when tables are too big to store and to join; even sharded databases still hit scalability limits.

HBase, on the other hand, allows an open data structure: there is no type or length constraint on any column value, and no limit on the number of columns; a single column family can hold millions of columns. That means you can store data of any shape and size in HBase, with random access, automatic failover, automatic versioning, automatic sharding, extreme scalability, and strict data consistency!
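As a rough illustration, HBase's logical data model can be pictured as a sparse, multi-dimensional sorted map: row key, then column family, then column qualifier, then timestamped versions. The plain-Python sketch below (toy helper names, not the HBase API; real HBase stores bytes and persists to HDFS) shows how two rows in the same family need not share any columns:

```python
# Toy model of HBase's logical data model: a sparse map of
#   row key -> column family -> qualifier -> {timestamp: value}.
# Illustration only; names like put/get_latest are ours, not HBase's.
table = {}

def put(row, family, qualifier, value, ts):
    """Insert a versioned cell; nothing forces rows to share columns."""
    table.setdefault(row, {}).setdefault(family, {}) \
         .setdefault(qualifier, {})[ts] = value

def get_latest(row, family, qualifier):
    """Return the most recent version of a cell, or None if absent."""
    versions = table.get(row, {}).get(family, {}).get(qualifier, {})
    return versions[max(versions)] if versions else None

put("row1", "cf", "name", "Alice", ts=1)
put("row1", "cf", "name", "Alicia", ts=2)          # newer version of the same cell
put("row2", "cf", "page_view_count", "42", ts=1)   # a completely different column

print(get_latest("row1", "cf", "name"))  # → Alicia (newest version wins)
```

Because absent cells simply take no space, a table with millions of potential columns costs nothing for the rows that do not use them.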

  • The Hadoop Database: Out-of-the-box Hadoop integration
  • Open data structure: No type or length limits on column values, and no limit on the number of columns
  • Extreme Scalability: HBase can be expanded from a single node to tens of thousands of nodes without much effort
  • Automatic Load-Balancing: Big tables are automatically split and redistributed as your data grows, via automatic sharding
  • Fault Tolerance: The failure of a disk or a server is not a priority-one maintenance task! If the cluster is big enough, even the failure of an entire rack of servers does not impact system functioning, thanks to automatic RegionServer failover
  • Strict Consistency: Data is consistent; every user sees the most recently updated data state, backed by built-in automatic versioning
  • Support for massively parallel processing (MPP)
  • No deadlock-related pauses
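The automatic sharding and load-balancing mentioned above can be sketched in a few lines of plain Python (hypothetical names, not HBase internals): a table is kept as a list of "regions", each owning a contiguous, sorted slice of the row-key space, and any region that grows past a threshold splits at its middle key with no manual resharding:

```python
import bisect

# Toy sketch of HBase-style automatic sharding. Real HBase regions hold
# gigabytes and are reassigned across RegionServers; here a "region" is
# just a sorted list of row keys with a tiny split threshold.
MAX_REGION_SIZE = 4
regions = [[]]  # start with one empty region

def region_for(key):
    """Find the region whose key range contains `key` (regions stay sorted)."""
    for r in regions:
        if not r or key <= r[-1]:
            return r
    return regions[-1]

def put_key(key):
    r = region_for(key)
    bisect.insort(r, key)
    if len(r) > MAX_REGION_SIZE:       # automatic split, no manual resharding
        mid = len(r) // 2
        i = regions.index(r)
        regions[i:i + 1] = [r[:mid], r[mid:]]

for k in ["a", "g", "c", "m", "e", "k", "b"]:
    put_key(k)

print(regions)  # keys remain globally sorted across the split regions
```

Because splits happen along the sorted keyspace, reads and scans can always be routed to exactly one region, which is what lets the cluster grow without client-side changes.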

You may not have petabytes of data to analyze today; nevertheless, you can deploy HBase and Hadoop with confidence and store anything in them. The platform is proven at scale: the Hadoop and HBase user community is global, active, and diverse, and the success of the biggest Web companies in the world demonstrates that Hadoop can grow as your business does. Companies across many industries participate, including financial services, social networking, media, telecommunications, retail, health care, and others. For more information, please read: Who uses HBase and Hadoop.


How to Start Your HBase Project

Data-driven decisions and applications create immense value from Big Data. There are three key steps to starting an HBase project successfully:

  • Defining the problem domains and your business use cases: Start with an inventory of business challenges and goals, narrow them down to those expected to provide the highest return with HBase/Hadoop.
  • Defining the technical requirements: Determine the characteristics of your data in terms of volume, velocity, and variety, and identify how HBase and Hadoop will store and process it
  • Planning the Big Data project: Construct concrete objectives and goals in measurable business terms; identify the time to business value and the expected outcomes. Plan the project approach, cost by category, measures, project activities, and timing.

Please feel free to contact us if you have any queries.
