Evolution of Databases Post Year 2000.

Data refers to raw facts that must be analyzed to yield information. A database is an organized collection of such data, together with the means to manipulate it and make it useful. A database management system (DBMS) is software that helps in defining, storing, manipulating and controlling the data within a database. In other words, it is a set of related data plus the programs that give convenient access to that data.

The relational database model is founded on set theory and was introduced by Edgar Codd of IBM Research in 1970. Its main assumption is that data can be represented as mathematical relations, where each relation is a subset of a Cartesian product of value domains. Each relation is presented as a table with a unique name, and each row in a table represents a relationship among a set of values. Before the relational model, data was typically kept in hierarchical and network databases or in file-based systems, where records were simply stored in named folders located in various directories.
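The idea of a relation as a subset of a Cartesian product can be made concrete with a small sketch. The domains and the works_in relation below are hypothetical values chosen only for illustration, and the is_relation helper is an assumption of this example rather than part of any database product.

```python
from itertools import product

# Hypothetical domains (illustrative values, not taken from the article).
names = {"Ada", "Linus"}
departments = {"Research", "Engineering"}

# The Cartesian product holds every possible (name, department) pair.
all_pairs = set(product(names, departments))

# A relation keeps only the pairs that actually hold; each tuple is one table row.
works_in = {("Ada", "Research"), ("Linus", "Engineering")}


def is_relation(rows, *domains):
    """Return True if every row is drawn from the Cartesian product of the domains."""
    return set(rows) <= set(product(*domains))


print(len(all_pairs))                              # 4 possible pairs
print(is_relation(works_in, names, departments))   # True
```

A row such as ("Ada", "Sales") would fail the check, because "Sales" is not in the departments domain. This is roughly the job that column types and constraints do in a relational table.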

Database modeling and databases have evolved closely together, and their history can be traced to the 1960s. A lot has happened since 2000. The period opened with the graph web, introduced through the Semantic Web stack of the World Wide Web Consortium (W3C) in 1999, and was followed by property graphs in 2006. A major development was the NoSQL wave, which ushered in Big Data from 2008.

Earlier databases relied largely on Structured Query Language (SQL) as the only way to store and retrieve data. NoSQL brought newer technology and additional mechanisms for storing and retrieving data. Such databases had existed before 2000 but were neither well known nor widely used; it was the Web 2.0 revolution that re-ignited them.

Web 2.0 shifted the web from users consuming content created by webmasters to users generating content themselves. Major sites such as Facebook and YouTube were among the first to adopt this model. It marked a shift in needs for developers and database administrators, since users across the world were adding a plethora of data to the internet almost second by second. Cloud computing then brought in the idea of massive shared and dedicated storage and processing capabilities, marking a further change for databases.

The requirements placed on technology have also changed considerably, moving towards simplicity of design and ease of scaling as a result of internet technologies. Availability of at least 99% and high access speeds have become critical. Conventional databases struggled in particular with the speed and scalability required. NoSQL databases use different data structures, including graph, key-value and document stores, which made them very fast for the workloads they target. They also introduced flexible schemas, removing some of the constraints that hindered conventional relational databases.
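To make the three data structures mentioned above more tangible, here is a minimal sketch that models each one with plain Python containers. It is not the API of any particular NoSQL product; the store names, sample records and the find_documents helper are all hypothetical.

```python
# Key-value: an opaque value looked up by a single key.
key_value_store = {"session:42": "user=ada;theme=dark"}

# Document: self-describing nested records whose fields may differ per document.
document_store = {
    "users": [
        {"_id": 1, "name": "Ada", "tags": ["admin"]},
        {"_id": 2, "name": "Linus", "email": "linus@example.com"},
    ]
}

# Graph: each node maps to the set of nodes it is connected to.
graph_store = {"ada": {"linus"}, "linus": {"ada", "grace"}, "grace": set()}


def find_documents(collection, **criteria):
    """Return the documents whose fields match every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


print(key_value_store["session:42"])
print(find_documents(document_store["users"], name="Ada"))
print(sorted(graph_store["linus"]))
```

Note that the two user documents carry different fields and no schema had to be changed to allow that, which is the flexibility the paragraph refers to.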

Nevertheless, NoSQL also came with shortcomings, for instance the inability to use joins across tables and a lack of standardization. On the other hand, it drove massive innovation in web and application development, making development easier and helping systems handle the rapid growth of the World Wide Web. Relational databases have also grown and have always kept their place in technological progress; they continue to offer consistency, reliability and ease of programming for business systems.

Between 2009 and 2011, Apache TinkerPop's Gremlin and Neo4j's Cypher emerged, with Cypher widely perceived as the first declarative property graph language for database implementations. Graph databases were initially deemed to be part of NoSQL, but given how far they have developed, they have earned the right to be treated as a category of their own. Graph data modelling focuses on graph technologies, most often property graphs, with some leaning towards semantic graph capabilities.
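A property graph attaches labels and key-value properties to both nodes and relationships. The sketch below illustrates that data model in plain Python; it is not how Neo4j or any other graph database is implemented, and the Node and Edge classes, the sample data and the neighbours helper are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: int
    target: int
    rel_type: str
    properties: dict = field(default_factory=dict)


# Hypothetical sample graph: two people and one company.
nodes = {
    1: Node(1, "Person", {"name": "Ada"}),
    2: Node(2, "Person", {"name": "Linus"}),
    3: Node(3, "Company", {"name": "Acme"}),
}
edges = [
    Edge(1, 2, "KNOWS", {"since": 2008}),
    Edge(1, 3, "WORKS_AT", {"role": "Engineer"}),
]


def neighbours(node_id, rel_type):
    """Names of nodes reached from node_id over outgoing edges of rel_type."""
    return [
        nodes[edge.target].properties["name"]
        for edge in edges
        if edge.source == node_id and edge.rel_type == rel_type
    ]


print(neighbours(1, "KNOWS"))  # ['Linus']
```

In Cypher, a comparable lookup would read roughly as MATCH (a:Person {name: 'Ada'})-[:KNOWS]->(b) RETURN b.name, where the pattern in the query mirrors the shape of the graph itself.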

In 2005, Roger Mougalas came up with the term Big Data to refer to a set of data that is tough to process and manage using conventional database management systems. Hadoop was created at Yahoo the same year, building on Google's MapReduce. The original purpose of Hadoop was to index the whole World Wide Web, and it now provides an open source platform that helps organizations crunch through huge data sets. Given the growth of social media and the advancements around Web 2.0, data is being generated ever more rapidly. Governments generate data at this scale too: in 2009 the government of India opted to capture the fingerprints, an iris scan and a photograph of each of its citizens, a population of close to 1.2 billion persons.
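The MapReduce model that Hadoop builds on can be sketched on a single machine as a word count. This is a simplified, in-memory illustration of the map, shuffle and reduce steps, not Hadoop's actual API; the function names and sample pages are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine the values collected for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


pages = ["big data big ideas", "data about data"]
print(word_count(pages))
```

In a real cluster the map and reduce functions run in parallel on many machines, and the shuffle step moves data across the network. The sketch keeps only the logical structure.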

Big Data, still a relatively new phenomenon, covers structured, semi-structured and unstructured data. Structured data can be stored, processed and retrieved in a fixed format. Semi-structured data combines structured and unstructured elements. Unstructured data has no fixed structure or format. Big Data gave momentum to technologies such as data mining and business intelligence, which meant that even archived data could be used to draw insights, helping businesses project trends and adapt to what those trends indicate.
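The difference between the three kinds of data shows up in how much work it takes to read a value out of them. The following sketch uses hypothetical sample records and only the Python standard library.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields.
structured = io.StringIO("id,name,amount\n1,Ada,120.50\n2,Linus,80.00\n")
rows = list(csv.DictReader(structured))

# Semi-structured: tagged fields, but nesting and optional keys vary per record.
semi_structured = json.loads(
    '{"id": 3, "name": "Grace", "orders": [{"amount": 42.0}], "note": "VIP"}'
)

# Unstructured: free text with no predefined fields at all.
unstructured = "Customer Grace called to say the delivery arrived late."

# Structured data can be aggregated directly by column name.
total = sum(float(row["amount"]) for row in rows)

# Semi-structured data needs defensive access because keys may be missing.
order_amounts = [order["amount"] for order in semi_structured.get("orders", [])]

# Unstructured data needs text processing; here, a naive keyword check.
mentions_delay = "late" in unstructured.lower()

print(total, order_amounts, mentions_delay)
```

The keyword check is deliberately naive. Real text analysis is far more involved, which is part of why unstructured data is the hardest of the three to exploit.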

Such advanced databases are helping various business sectors, including retail, education, health care and banking, among others. In health care, new database technologies are applied in areas such as administration, patient care and diagnosis management. Other applications include supply chain management and customer relationship management. In the supply chain, they aid the management of the production flow of goods and services, while in customer relationship management they aid the management of processes and workflows.

For the full paper, including SQL table creation and data insertion code, email rutofeli@gmail.com.
