Big data can be a big challenge for enterprises. Many organizations have built environments for transactional data with Relational Database Management Systems (RDBMS), but are now overwhelmed with enormous amounts of new types of data. These include, but are not limited to, social media data, server logs, clickstream data, machine data, and geolocation data. These new data sources include unstructured and semi-structured data, and they all share the common big data characteristics of volume, velocity, and variety.
Relational databases came into wide use in the 1980s. These table-based (row and column) relational databases became popular because they provided many useful features.
Relational databases process and store structured data and offer many tools, but they also bring a significant amount of infrastructure and overhead. A relational database alone cannot meet all of the business requirements and opportunities that have arisen with big data. The biggest limitation is that you must commit to a rigid schema ahead of time.
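The rigid-schema limitation can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and sample values are hypothetical and chosen only for illustration. The sketch shows that a row carrying an attribute the schema does not define is rejected until the schema itself is changed.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")


def insert_with_location(connection):
    """Try to store a field; return False if the schema rejects it."""
    try:
        connection.execute(
            "INSERT INTO customers (id, name, geolocation) VALUES (?, ?, ?)",
            (2, "Grace", "51.5,-0.1"),
        )
        return True
    except sqlite3.OperationalError:
        # The table has no 'geolocation' column, so the insert is rejected.
        return False


# First attempt fails: the schema was fixed before this attribute existed.
rejected_first = not insert_with_location(conn)

# The schema has to be changed before the new attribute can be stored.
conn.execute("ALTER TABLE customers ADD COLUMN geolocation TEXT")
accepted_after = insert_with_location(conn)

rows = conn.execute(
    "SELECT id, name, geolocation FROM customers ORDER BY id"
).fetchall()
```

In this small in-memory database the schema change is quick. The cost described in the text comes from applying the same kind of change to large production tables and from the change process around it.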
Using NoSQL does not necessarily involve scrapping your existing RDBMS and starting from scratch. NoSQL should be thought of as a tool for solving the new types of challenges associated with big data. Some business processes can continue to be addressed effectively with an RDBMS. But with the new challenges presented by big data, you will likely face new problems that can be solved more efficiently, and more cost-effectively, with a NoSQL database. A NoSQL database can be used to solve new problems that require:
• Scalability – A NoSQL database can scale horizontally to meet the demands of big data. Applications can run in parallel on a cloud-based cluster comprising dozens, hundreds, or even thousands of commodity servers. The NoSQL scale-out architecture enables web applications to scale dynamically up to thousands or even millions of users. As the number of concurrent users grows, you can dynamically add more cluster nodes (commodity servers) to process the additional load.
• Flexibility – A schema-less NoSQL database can process and store structured, unstructured, and semi-structured data, and enable flexible and rapid development of applications and use cases such as right-time decisioning, recommendations, profile management, bidding, and risk profiling. With an RDBMS, adding data such as a new column requires a schema change, which can take hours or even days, depending on your institutional infrastructure. With NoSQL, new fields can be added and updated without a schema change.
• Speed – The parallel-processing nature of a NoSQL database, along with caching and aggregation, can provide fast (sub-millisecond) transaction response times at scale.
• Developer productivity and agility – A NoSQL database shortens time to market for new applications and updates because developers don’t have to shoehorn data into a fixed schema. With NoSQL, applications can be rapidly prototyped, tested, and deployed into production in a cloud-based cluster.
• Operational readiness – In contrast to an RDBMS, which typically requires schema migration scripts, significant manual effort, and scheduled downtime for release upgrades, data model changes in NoSQL are relatively easy and have minimal impact on operational readiness. Maintenance windows are compressed, and the performance impact on users is far less noticeable.
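The scalability and flexibility points above can be sketched with a toy, in-memory model written in Python. It is not the API of any particular NoSQL product. The class ToyDocumentCluster, its methods, and the sample keys and fields are all hypothetical. Documents with different shapes are stored side by side without a schema change, keys are spread across nodes by hash, and a node is added while the data stays reachable.

```python
import hashlib


class ToyDocumentCluster:
    """In-memory stand-in for a sharded, schema-less document store.

    Hypothetical class for illustration; not a real product's API.
    """

    def __init__(self, node_count):
        # Each "node" is just a dict mapping document key -> document.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # Stable hash so the same key always maps to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.nodes)

    def put(self, key, document):
        self.nodes[self._node_for(key)][key] = document

    def get(self, key):
        return self.nodes[self._node_for(key)].get(key)

    def add_node(self):
        # Naive rebalancing: re-place every document for the new node count.
        documents = [item for node in self.nodes for item in node.items()]
        self.nodes = [{} for _ in range(len(self.nodes) + 1)]
        for key, document in documents:
            self.put(key, document)


cluster = ToyDocumentCluster(node_count=3)

# Documents with different fields coexist; no schema is declared up front.
cluster.put("user:1", {"name": "Ada"})
cluster.put("user:2", {"name": "Grace", "geolocation": [51.5, -0.1]})
cluster.put("click:9", {"url": "/pricing", "referrer": "search"})

# Scale out by adding a node; existing documents remain retrievable.
cluster.add_node()
grace = cluster.get("user:2")
```

The rebalancing here moves every document whenever a node is added, which keeps the sketch short. Production systems generally use partitioning schemes, such as consistent hashing, that move only a fraction of the keys.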