Mike's Notes
Kingsley Idehen posted on the Ontolog Forum yesterday about the ongoing discussion on Design Pattern Ontology. I have added the post below because it contains valuable links. The post also raised the subject of polyglot persistence, the practice of using different database storage technologies within complex software. I also added the Wikipedia article on the subject as a good introduction.
Resources
- https://groups.google.com/g/ontolog-forum/c/RL97zvkBBMg/m/cYWdXU7qCwAJ
- https://en.wikipedia.org/wiki/Polyglot_persistence
Post by Kingsley Idehen
These days it's just a simple prompt sent to the likes of Perplexity AI, as per:
Or even ChatGPT as per:
Dump from ChatGPT.
Certainly! Below is the information about Michael Stonebraker and “Not Only SQL” (NoSQL) with relevant entities hyperlinked to their DBpedia IDs or Wikipedia pages where DBpedia links are unavailable.
Michael Stonebraker
- Who is he?
- Michael Stonebraker is a Turing Award-winning computer scientist known for his foundational contributions to database systems. He played a key role in developing several pioneering database systems, including:
- Ingres: One of the earliest relational database management systems (RDBMS).
- PostgreSQL: An advanced object-relational database system.
- Vertica: A columnar database optimized for analytical workloads.
- C-Store: A research prototype that inspired Vertica.
- H-Store: A main-memory distributed database that led to VoltDB.
- SciDB: A database designed for complex scientific data.
- Key Contributions:
- Advocate for specialized database systems tailored to specific use cases (e.g., OLTP, OLAP, scientific data).
- Vocal critic of “one-size-fits-all” database architectures, highlighting inefficiencies in applying traditional relational models to all workloads.
“Not Only SQL” (NoSQL)
- What is it?
- NoSQL refers to a family of non-relational databases designed for scalability, flexibility, and performance in use cases where traditional RDBMSs fall short. The term “Not Only SQL” emphasizes that these systems do not exclusively use SQL or the relational model.
- Key Characteristics of NoSQL:
- Schema-less or flexible schema.
- Horizontal scaling (e.g., distributed databases).
- Optimized for unstructured, semi-structured, or poly-structured data.
- Use cases often include web-scale applications, big data analytics, and real-time data processing.
- Popular NoSQL Databases:
- Document Stores (e.g., MongoDB, Couchbase)
- Key-Value Stores (e.g., Redis, Amazon DynamoDB)
- Wide-Column Stores (e.g., Apache Cassandra, Apache HBase)
- Graph Databases (e.g., Neo4j, ArangoDB)
Stonebraker’s Perspective on NoSQL
- Critique of NoSQL:
- Michael Stonebraker has been critical of NoSQL systems, particularly their lack of standardization, consistency guarantees, and durability in some implementations.
- He argues that SQL systems can evolve to meet modern demands without abandoning decades of database research on ACID properties and consistency.
- His work with systems like Vertica and VoltDB showcases that traditional SQL databases, when optimized, can outperform NoSQL systems for certain workloads.
- Support for Specialized Systems:
- While skeptical of NoSQL as a panacea, Stonebraker advocates for “one size does not fit all” in databases. He believes the right database architecture depends on the specific workload:
- SQL for OLTP and OLAP.
- Columnar systems for analytics.
- Distributed and in-memory systems for real-time processing.
- Graph databases for relationship-heavy workloads.
The Bigger Picture
The “Stonebraker vs. NoSQL” narrative reflects a broader debate in the database world:
- Relational vs. Non-Relational: Relational systems excel at structured data, while NoSQL systems are designed for flexibility and scalability.
- The Rise of Polyglot Persistence: Modern applications often combine multiple database types (e.g., relational for transactions, NoSQL for analytics, graph for relationships).
- Convergence of Features: Over time, SQL and NoSQL systems have borrowed features from each other:
- NoSQL systems are adding SQL-like query capabilities (e.g., SQL-like queries in MongoDB).
- SQL systems are adopting features like horizontal scaling and schema flexibility.
Takeaways
- Michael Stonebraker’s advocacy for specialized databases aligns with the ethos of NoSQL, even though he critiques its marketing and some implementations.
- The ongoing evolution of database systems emphasizes choosing the right tool for the job rather than adhering dogmatically to one paradigm.
- His work continues to influence the industry, bridging the gap between traditional RDBMSs and modern data needs.
Polyglot persistence
Polyglot persistence is a term that refers to using multiple data storage technologies within a single system, in order to meet varying data storage needs. Such a system may consist of multiple applications, or it may be a single application with smaller components.
Polyglot persistence shares its origins with the term polyglot programming. Polyglot programming, a term coined by Neal Ford in 2006, expresses the idea that computer applications should be written in a mix of different programming languages, to take advantage of the fact that different languages are suited to different problems. Complex applications combine different types of problems, so picking the right language for each job may be more productive than trying to solve all aspects of the problem using a single language. The same concept can be applied to databases: an application can communicate with different databases, using each for what it is best at to achieve an end goal, hence the term polyglot persistence.
Since there are numerous databases available to solve different problems, using a single database to satisfy all of a program's requirements can result in a non-performant, "jack of all trades, master of none" solution. Relational databases, for example, are good at enforcing relationships that exist between various data tables. To discover a relationship or to find data from different tables that belong to the same object, an SQL join operation can be used. This works well when the data is small, but can become problematic as the data grows. A graph database might solve the problem of relationships in the case of Big Data, but it might not solve the problem of database transactions, which relational database management systems may handle better. A NoSQL document database, in turn, might be used to store unstructured data for another part of the problem. Thus different problems are solved by different database systems, all within the same application.
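As a small illustration of the relational strengths mentioned above (enforcing relationships and joining tables), here is a minimal sketch using Python's built-in sqlite3 module. The customers/orders schema, the sample rows, and the helper names orders_for and insert_orphan_order are assumptions made up for this example; they do not come from the article.

```python
import sqlite3

# Hypothetical schema for illustration: customers and their orders.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        item        TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
    [(1, "keyboard"), (1, "monitor")],
)


def orders_for(conn, name):
    """Join the two tables to collect the items ordered by one customer."""
    rows = conn.execute(
        "SELECT o.item FROM customers c "
        "JOIN orders o ON o.customer_id = c.id "
        "WHERE c.name = ? ORDER BY o.id",
        (name,),
    ).fetchall()
    return [item for (item,) in rows]


def insert_orphan_order(conn):
    """Try to add an order for a customer that does not exist."""
    try:
        conn.execute("INSERT INTO orders (customer_id, item) VALUES (99, 'mouse')")
    except sqlite3.IntegrityError:
        return False  # the database rejected the broken relationship
    return True
```

Here the join reassembles data that belongs to the same customer from two tables, and the foreign key makes the database itself refuse an order that points at a missing customer.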
Such data storage technologies include, but are not limited to:
- Relational
- NoSQL
- Graph
- In-memory
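To make the idea concrete, the following is a rough sketch of a single application that sends each kind of data to a different store. Everything in it is an assumption for illustration: the OrderService class and its methods are hypothetical, sqlite3 plays the relational store, and plain Python dictionaries stand in for a document database and an in-memory cache, which in a real system would be separate products.

```python
import json
import sqlite3


class OrderService:
    """Hypothetical service that routes each kind of data to a different store."""

    def __init__(self):
        # Relational store: transactional order records.
        self.sql = sqlite3.connect(":memory:")
        self.sql.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
        )
        # Stand-in for a document store: JSON documents keyed by order id.
        self.documents = {}
        # Stand-in for an in-memory key-value cache of per-customer totals.
        self.cache = {}

    def place_order(self, customer, total, details):
        # The connection context manager commits on success, rolls back on error.
        with self.sql:
            cur = self.sql.execute(
                "INSERT INTO orders (customer, total) VALUES (?, ?)",
                (customer, total),
            )
        order_id = cur.lastrowid
        # Free-form order details go to the document store.
        self.documents[order_id] = json.dumps(details)
        # The cached total for this customer is now stale, so drop it.
        self.cache.pop(customer, None)
        return order_id

    def customer_total(self, customer):
        # Serve from the cache when possible, otherwise ask the relational store.
        if customer not in self.cache:
            row = self.sql.execute(
                "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer = ?",
                (customer,),
            ).fetchone()
            self.cache[customer] = row[0]
        return self.cache[customer]

    def order_details(self, order_id):
        return json.loads(self.documents[order_id])
```

The caller only sees place_order, customer_total, and order_details; which store answers each call is a detail of the service, which is one common way to keep a polyglot design manageable.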