Archive for the ‘databases’ Category

Tarski and Codd

November 25, 2015

Wikipedia says that “Relational calculus is essentially equivalent to first-order logic, and indeed, Codd’s Theorem had been known to logicians since the late 1940s.”  I couldn’t find the cited sources online, but did find these interesting papers:

Applications of Alfred Tarski’s Ideas in Database Theory

and

Tarski’s influence on computer science (see the section starting “The final thing I want to tell something about is the connection of Tarski’s ideas and work with database theory.”)

If you’ve studied mathematical logic (or math, e.g., topology), you are probably familiar with Tarski’s name.  The historical development of mathematical logic and relational database theory is an interesting topic I’d like to understand better.  Codd’s 1970 paper is fairly readable, and this 1972 paper deals with the correspondences between the different approaches taken.

semi-confused about semi-relational

January 30, 2014

I’m reading a 2012 Google paper about Spanner.  I saw Spanner described somewhere as “semi-relational” and wanted to read more.  The paper I’m reading is “Spanner: Google’s Globally-Distributed Database”.

Early on, on page 4, is this paragraph:

“Spanner’s data model is not purely relational, in that rows must have names. More precisely, every table is required to have an ordered set of one or more primary-key columns. This requirement is where Spanner still looks like a key-value store: the primary keys form the name for a row, and each table defines a mapping from the primary-key columns to the non-primary-key columns. A row has existence only if some value (even if it is NULL) is defined for the row’s keys. Imposing this structure is useful because it lets applications control data locality through their choices of keys.”

This made no sense to me.  It’s not purely relational because every table needs a primary key?  In relational theory, every relation is required to have at least one candidate key.  Is this confusion between “logical” relational theory and current implementations that allow duplicate rows in tables?  Maybe because it’s an *ordered* set, is that the point?
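To make the comparison concrete, here is a rough Python sketch of the two views: a relation as a set of tuples, and a table as a mapping from key columns to the remaining columns, as the quoted paragraph describes. The column layout, the sample rows, and the helper names `is_unique_key` and `as_keyed_mapping` are all made up for illustration; nothing here comes from the Spanner paper or its API.

```python
# A relation in the textbook sense: a set of tuples (user, date, body).
# Being a set, it cannot hold duplicate rows, so the full tuple is
# always a (trivial) key. The sample data is hypothetical.
relation = {
    ("alice", "2014-01-30", "hello"),
    ("bob", "2014-01-30", "hi"),
}
relation.add(("alice", "2014-01-30", "hello"))  # no effect: already present


def is_unique_key(rows, positions):
    """Return True if the columns at `positions` identify each row uniquely.

    This checks uniqueness only, not minimality, so it really tests for a
    superkey rather than a candidate key.
    """
    projected = [tuple(row[i] for i in positions) for row in rows]
    return len(projected) == len(set(projected))


def as_keyed_mapping(rows, key_positions):
    """View the rows as a mapping: key columns -> non-key columns."""
    mapping = {}
    for row in rows:
        key = tuple(row[i] for i in key_positions)
        rest = tuple(v for i, v in enumerate(row) if i not in key_positions)
        if key in mapping:
            raise ValueError(f"columns {key_positions} are not a key: {key}")
        mapping[key] = rest
    return mapping


# The "name" of each row is its (user, date) pair.
keyed = as_keyed_mapping(relation, (0, 1))
```

In this toy version the keyed mapping carries the same information as the relation with a declared key, which is roughly why the first three sentences of the quote puzzle me.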

To me, the non-relational part sounds like the fact that primary keys can include NULLs.

Or are they really referring to the fact that data are grouped somewhat hierarchically?  (As explained later in the paper.)  That would make more sense to me.

Anyway, those first three sentences confuse me.  But I’m new to a lot of this.  I’m just a simple caveman.  Your modern ways confuse me.  I’m not arguing that Spanner is purely relational, just saying that I don’t get those first three sentences.  Maybe someone can explain them to me.

database musings (“deep thoughts”)

November 18, 2012

I have a soft spot for hierarchical databases.  My first database-related job was programming in M/Mumps.  I know the standard history of databases says that hierarchical databases are a relic of the past, and that, thanks to Codd, relational databases solve many of the problems of hierarchical (and other kinds of) databases.  I like relational databases: I was an Oracle DBA, I’ve worked with DB2, Sybase, Postgres, mSQL, others, and now MySQL.  I really like InnoDB.  However, I am occasionally sad that hierarchical databases seem a thing of the past.
Or are they?  Yesterday I had a thought that hierarchical databases are much more widely used than relational databases.  In fact, maybe every single computer has a hierarchical database that is used by every computer user, whether they have database software installed or not.  The file system!  Isn’t that a hierarchical database?  The idea made me feel better.  🙂
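
As a small illustration of the idea, here is a Python sketch that treats a directory tree as a hierarchical store: directories are interior nodes, files are leaves, and a path is a sequence of keys. The helper names `tree_as_dict` and `get`, and the `patients/1/name` layout, are invented for this example; it is a toy, not a claim about how any real hierarchical database works.

```python
import tempfile
from pathlib import Path


def tree_as_dict(root):
    """Return a nested dict mirroring the directory tree under `root`.

    Directories become dicts; files become their size in bytes.
    """
    node = {}
    for entry in sorted(Path(root).iterdir()):
        if entry.is_dir():
            node[entry.name] = tree_as_dict(entry)
        else:
            node[entry.name] = entry.stat().st_size
    return node


def get(root, *keys):
    """Read the value stored at the hierarchical key given by `keys`."""
    return Path(root, *keys).read_text()


def demo():
    """Build a tiny tree in a temporary directory and query it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical layout: patients/<id>/<field>
        (root / "patients" / "1").mkdir(parents=True)
        (root / "patients" / "1" / "name").write_text("Alice")
        (root / "patients" / "1" / "city").write_text("Oslo")
        return tree_as_dict(root), get(root, "patients", "1", "name")
```

Calling `demo()` returns the nested structure together with the value found at `patients/1/name`, so a lookup is just a walk down the tree one key at a time.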