undo and redo

April 18, 2014

Here’s something that used to make my head spin a bit… A quick quiz: does undo generate redo? does redo generate undo?

When I heard that undo generated redo, it sent me for a loop. Undo is stored in the system tablespace (undo segments), as regular data, and therefore generated redo. Then I thought, OK, would redo generate undo? If so, we’re obviously in a vicious cycle. So, no. Why?

We need redo for undo so that if there is a crash, and some data has (have) been written to disk but not committed, and the undo wasn’t yet written to disk, we can recreate it from the redo, then use it to undo the uncommitted but written data. (Corrections if I’m inaccurate, please.  This is all from my head only.)

We don’t need undo for redo – if there is a crash, and a transaction was not fully completed, we need the undo to roll it back. And if we don’t need the redo, it’s not something we need to undo. The redo logs are special files outside of the database proper.

So, this may all be clear as mud from my writing, but I’ve always wanted to think it out loud.  Please correct me in the comments if I’m wrong on anything here.

semi-confused about semi-relational

January 30, 2014

I’m reading a 2012 google paper about Spanner.  I saw Spanner described somewhere as “semi-relational” and wanted to read more.  The paper I’m reading is “Spanner: Google’s Globally-Distributed Database” .

Early on, in page 4, is this paragraph:

“Spanner’s data model is not purely relational, in that  rows must have names. More precisely, every table is required to have an ordered set of one or more primary-key columns. This requirement is where Spanner still looks like a key-value store: the primary keys form the name for a row, and each table defines a mapping from the primary-key columns to the non-primary-key columns.  A row has existence only if some value (even if it is NULL) is defined for the row’s keys. Imposing this structure is useful because it lets applications control data locality through their choices of keys.”

This made no sense to me.  It’s not purely relational because every table needs a primary key?  In relational theory, every relation is required to have at least one candidate key.  Is this confusion between “logical” relational theory and current implementations that allow duplicate rows in tables?  Maybe because it’s an *ordered* set, is that the point?

To me, the not-relational part to me sounds like the fact that primary keys can include NULLs.

Or are they really referring to the fact that data are grouped somewhat hierarchically?  (As explained later in the paper.)  That would make more sense to me.

Anyway, those first three sentences confuse me.  But I’m new to a lot of this.  I’m just a simple caveman.  Your modern ways confuse me.  I’m not arguing that Spanner is purely relational, just saying that I don’t get those first three sentences.  Maybe someone can explain them to me.

MySQL for Beginners Training on Demand course, Day 3

December 14, 2012

It’s been a while since I posted about Day 1 and Day 2 of the MySQL for Beginners online Training on Demand course offered by Oracle (http://education.oracle.com).  Yesterday was 90 days from the first posting, so that’s how long you would have had to use this course!

Day 3 is a whirlwind tour of some further aspects of SQL.  Here’s what it covers:

Introduction to Data Manipulation Language (DML) in SQL
– introduces INSERT, REPLACE, UPDATE and DELETE statements and their syntax and uses

Intro to Functions
– string functions, temporal, numeric, control flow and aggregate functions

Intro to Joins
– describes cartesian, inner, and outer joins – their syntax and uses

Intro to Subqueries
– types and placements of subqueries
– how and when to convert to joins

This is intended for beginners, and if you know the material, you might look at the headings and think “I know this” and skip it.  Personally, I always like to review, and try to see what I do and don’t know.  While some of this material was covered at a high level, there was enough detail that I learned some things, and found several stepping-off points to research and learn more.  Here are some of my own notes on things I learned more about:

First, when I read about REPLACE, I did some tests to make sure I understood how it worked.  From the course, we know that “it is possible for a single row to replace more than one old row if the table contains multiple unique constraints.”  I haven’t used REPLACE in too much detail, so I created some tables with a primary key and two other unique, multi-column constraints, and did some playing around.  For example, I used a single-row REPLACE to get three rows replaced by the one new row.  (I also was able refresh my memory on how REPLACE works, by deleting then inserting entire rows, as opposed to UPDATE.)

I enjoyed the introduction to functions – I always find it good to review the built-in functions, and always find some functions I’d forgotten about, never really used, or didn’t even know existed.  I got practice on the syntax and usage of some of the less common functions.

For one of my stepping-off points, I did some more reading on ROUND() and how it deals with numbers ending in .5, depending on whether the datatype is an exact-value datatype, or an approximate-value datatype.  (You may notice a trend in my interests, if you read Day 2.)

Then, in the intro to joins, I found out, for example, that joining with “ON” is not exactly the same as joining with “USING”.  “USING” treats the join column as one, whereas “ON” treats it as two.  Here’s an example.  Table j1 has (1,’j1.1′),(2,’j1.2′) and table j2 has (1,’j2.1′),(2,’j2.2′).

mysql> SELECT * FROM j1 JOIN j2 USING (a);
+——+——+——+
| a    | b    | b    |
+——+——+——+
|    1 | j1.1 | j2.1 |
|    2 | j1.2 | j2.2 |
+——+——+——+
2 rows in set (0.00 sec)

mysql> SELECT * FROM j1 JOIN j2 ON (j1.a=j2.a);
+——+——+——+——+
| a    | b    | a    | b    |
+——+——+——+——+
|    1 | j1.1 |    1 | j2.1 |
|    2 | j1.2 |    2 | j2.2 |
+——+——+——+——+
2 rows in set (0.00 sec)

In the intro to subqueries, I reviewed correlated and uncorrelated subqueries, and where subqueries can be used (in the SELECT, the FROM or the WHERE).  Also, the possible result types (scalar, row, column, and table) and how the different result types can be used differently in queries (eg, for comparisons, in functions, etc).  These are fairly simple concepts, but somehow they keep slipping around in my mind, and reviewing them is very helpful to me.

So, the course covers a wide ground, but also gives enough detail to chew on and learn from.  Having it available for 90 days give a lot of good time for chewing.  I also find it useful in this regard that you get the streaming video of the classroom course, with captioning available, and the pdf student guide and activity guide – sometimes I prefer using one or the other, and for reviewing and “chewing”, the guides are particularly helpful.  This course covers a lot of material, and I’m glad I get to review it!  Stay tuned for day 4…

database musings (“deep thoughts”)

November 18, 2012

I have a soft spot for hierarchical databases.  My first database-related job was programming in M/Mumps.  I know the standard history of databases says that hierarchical databases are a relic of the past, and that, thanks to Codd, relational databases solve many of the problems of hierarchical (and other kinds of) databases.  I like relational databases – I was an Oracle DBA, I’ve worked with DB2, Sybase, Postgres, mSQL, others, and now MySQL.  I really like InnoDB.  However, I am occasionally sad that hierachical databases seem a thing of the past.
Or are they?  Yesterday I had a thought that hierarchical databases are much more widely used than relational databases.  In fact, maybe every single computer has a hierarchical database that is used by every computer user, whether they have database software installed or not.  The file system!  Isn’t that a hierarchical database?  The idea made me feel better.  :)

MySQL for Beginners Training on Demand course, Day 2

November 1, 2012

It’s been a while since my post on taking day 1 of this course; as promised, here are my notes and observations after taking Day 2.

First, this is one of the nice things about taking Training on Demand.  I could take Day 1, go about my other business, apply what I learned, etc, then, when ready, take Day 2, even a month or so later!  I was able to go back and review Day 1, then move on to Day 2, entirely at my own schedule.

I remembered something from earlier, and wanted to find it again, so I tried out the search capabilities.  I searched for the phrase I wanted, and up came links to the lessons it appeared in.  I could click on any of the results to view that lesson, and watch, or check the captioning to find the phrase I was looking for.  It worked perfectly!

Day 2 covers data types, basic SELECT queries, and how to create and maintain tables and databases (using DROP and ALTER, and how they work).  Again, I enjoy reviewing this kind of material.  And if it’s new, it will give you plenty to chew on and try out.

In my case, the discussion of data types sent me off on a tangent of my own, about floating-point data and rounding errors.  I was able to google and research, then get back to the course when I was done.  (By the way, I found a good link on the subject at at http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html – “What Every Computer Scientist Should Know About Floating-Point Arithmetic” – in case you’re interested.)  So, again, that’s a nice thing about Training on Demand – you can start, stop, go off on tangents, and resume when ready.

Activities in Training on Demand include quizzes to test your knowledge (with answers in the back of the activity guide, or given by the instructor in a separate video segment for each activity).

In summary, I enjoyed Day 2, and look forward to Days 3 and 4, and the bonus material.  Stay tuned!

Oh for the Wait Interface, part 3

October 31, 2012

To repeat myself (from “part 2″):

“a while back I wrote a post expressing a longing for wait-interface tuning”.

The original post, and “part 2″ were both written in 2007, while I was working for a startup in Pasadena, as a DBA.  Now I work for Oracle, supporting MySQL, and by now, MySQL has incorporated a wait interface.  So, to do justice, I am writing this “part 3″ and saying – it’s been done!

MySQL now has the performance schema, which instruments a vast number of events and waits.  The man page for the current GA version (5.5) is at http://dev.mysql.com/doc/refman/5.5/en/performance-schema.html .   It’s a big chapter, as the wait interface is highly configurable and versatile.

http://mysql.wisborg.dk/2012/09/03/what-is-the-mysql-performance-schema-and-why-is-it-needed/ gives a quick overview about the performance schema (“P_S”).  To get an idea of the wide variety of things instrumented, see, for example, http://www.markleith.co.uk/2012/07/24/a-mysql-replication-load-average-with-performance-schema/
and
https://blogs.oracle.com/svetasmirnova/entry/performance_schema_in_version_5 .

Because there is so much that can be configured with the performance schema (enabling the tracking of various events and waits), and because the queries to get the information out can be complicated, you might also want to check out Mark Leith’s collection of useful examples at http://www.markleith.co.uk/ps_helper/ .

So, there you go, now we have it!

(And now I have three posts on the subject, each with the heading capitalized a little differently, ’cause that’s the way I roll…)

Oracle Training on Demand – MySQL for Beginners, Day One

September 13, 2012

Today I took the first day of MySQL for Beginners from Oracle’s Training on Demand offerings.  From the course overview,

“This hands-on class covers the fundamentals of SQL and relational databases, using MySQL[tm] as a teaching tool.  The course is suitable for new or existing users of MySQL who have little or no previous experience with a relational database management system.”

(For more details on this course, and other training offerings and formats, go to http://education.oracle.com .)

Here are some of my opinions and observations:

The class came up very easily for me, no glitches or waiting. (I used Firefox today.  I’ll try IE tomorrow, then others.)

It feels a bit like being in an Oracle University classroom – you watch video of the instructor (Brent Dayley) giving the class.  Brent Dayley is a great instructor.  (And I’ve never had a bad instructor in Oracle classes, from my days before I worked for Oracle.)

There are tabs across the page, and one is for the “ekit”.  This contains the e-books (.pdf’s) you’d normally use in the classroom – a course book and an activity guide.  I suggest opening these before you start the course.  It’s great for reference.)

One nice thing is that you can pause, replay, skip back, etc.  I find that when I go to a classroom, I can only take in so much, and sometimes my mind wanders.  Here, I can pause, take a break or do some work, and come back to it.

It is a full course, with a lot of information.  The pace is fast on the first day, if you are new to RDBMSs.  There is a quick overview of what an RDBMS is, MySQL, installing and connecting, accessing data, and even data modelling and normal forms, all in day one.  So I paused sometimes to let it sink in.  (Usually by the end of a week-long class in a classroom, I’m just taking notes and planning to re-read it when I get home – which I never seem to find time to do.)

By the way, when you sign up, you get 90 days to use the training, but the labs in particular you schedule and use for 5 days.  So you might want to do the course in 5 days, then go back and review.  Or go through the course at your leisure, then go back and do the labs then review some more.  (Etc.)   Of course, you can also install MySQL on your own computer, download and install the sample databases (from http://dev.mysql.com/doc/index-other.html ) and do exercises and more at your leisure, too.

Going through the documentation and day one of the class was very convenient and full of information.  I’m excited to take and write about days 2, 3 and 4 over the coming weeks – stay tuned for more!

MySQL Training on Demand – online training classes from Oracle

August 31, 2012

It’s been a while since I’ve written any blog entries.  I like to think and write about basics and fundamentals, and in my work (Senior MySQL Support Engineer at Oracle), I can get caught up in particulars, so sometimes it’s hard to see the forest for the trees.

That’s why when we were offered a chance to try out Oracle’s Training on Demand (“ToD”) and take the new offering of “MySQL for Beginners”, I jumped at it.  I can go back to fundamentals, and now I can blog about the basics – in this case, the basics of Oracle’s Training-on-Demand – how it works, the experience, etc.

In my previous jobs as a DBA, I always enjoyed taking training classes (again, I like to focus on fundamentals).  I remember first (LONG ago) going to classes to learn linux.  (We felt initiated when, at the end, the instructor said “do a ps -ef and pipe it to grep for cron” and we all understood him.)  Later, I also took classes for being an Oracle DBA (then taking the first, “beta” certification exams at Oracle Open World and passing), then a MySQL DBA.  All of them were a welcome chance to take a break from the regular work day and go back to school.

Now Oracle Training on Demand offers a way to take such courses online, which offers many benefits, some of them beyond normal classroom training:

  •    you don’t need to wait for a class to be offered in your area
  •    the training is available for use for 90 days, 24×7
  •    online access to a student guide
  •    white-boarding and online labs, for hands-on experience
  •    Q&A available with an Oracle University Instructor
  •    you can start, stop, pause, skip or rewind as you like
  •    full-text search to find relevant spots in the video quickly

I am excited about this, because it can offer the best of a classroom and the best of a book.  Sometimes I learn best from a book, but I enjoy a classroom, too.  With a book, I can go at my own pace, put it down, pick it up, go back, look ahead, etc.  Same with ToD.  In a classroom, I can ask questions, or get hands-on lab practice.  Same with ToD.  And you get it for 90 days – not so with a classroom course.  The list goes on…

So I look forward to trying this out and blogging about the course content, and about how Oracle’s Training on Demand works.  In the meantime, you can read an announcement about this course at

https://blogs.oracle.com/MySQL/entry/all_new_mysql_for_beginners

or see the page for the course at

http://education.oracle.com/pls/web_prod-plq-dad/db_pages.getpage?page_id=557&get_params=courseId:D77226GC10,showLabTab:Y

More to come soon!

on NULL and NOT IN

February 27, 2010

I’ve been trying to think of something “big” to write about for so long, I haven’t written anything.  So I’ll write about something “small” that I found out the other day.

It turns out, ‘NOT IN’ and ‘NULL’ can have an odd (to me) effect.

Say you are doing something like

SELECT a FROM table1
WHERE a NOT IN
(SELECT a FROM table2);

If there are any NULLs in the table2’s a column, you will never get any results from this query.
Here’s an example:

– first, here are the two tables I used:

mysql> select * from test1;
+——+
| a    |
+——+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
+——+
6 rows in set (0.02 sec)

mysql> select * from test2;
+——+
| a    |
+——+
|    1 |
|    2 |
|    3 |
| NULL |
+——+
4 rows in set (0.00 sec)

mysql> select a from test1 where a not in (select a from test2);
Empty set (0.02 sec)

– personally, I expected to get 4,5,6.  For that, you can do the following:

mysql> select a from test1 where a not in (select a from test2 where a is NOT NULL);
+——+
| a    |
+——+
|    4 |
|    5 |
|    6 |
+——+
3 rows in set (0.00 sec)

Why is this?  It is actually to follow the SQL standards on NULL.  As mentioned at http://dev.mysql.com/doc/refman/5.1/en/comparison-operators.html ,

“To comply with the SQL standard, IN returns NULL not only if the expression on the left hand side is NULL, but also if no match is found in the list and one of the expressions in the list is NULL.”

Just as “1=NULL” is undefined rather than false, and “1<>NULL” is too, “1 in (null)” is also undefined (or null), and “1 NOT IN (null) is null”.

mysql> select 1 in (null);
+————-+
| 1 in (null) |
+————-+
|        NULL |
+————-+
1 row in set (0.00 sec)

mysql> select 1 not in (null);
+—————–+
| 1 not in (null) |
+—————–+
|            NULL |
+—————–+
1 row in set (0.00 sec)

To simplify the examples with tables above, we can do the following, and see similar results.

mysql> select 1 not in (2);
+————–+
| 1 not in (2) |
+————–+
| 1            |
+————–+
1 row in set (0.00 sec)

mysql> select 1 not in (2,null);
+——————-+
| 1 not in (2,null) |
+——————-+
| NULL              |
+——————-+
1 row in set (0.00 sec)

Personally, once I read the reasoning, it made sense to me, but initially I was surprised.  I thought I’d write about it, because it is common to do a “SELECT … WHERE NOT IN (SELECT …)”, as in the example above with table1 and table2, and the results might not be what you expect if you have any NULLs in your table.

and back

April 20, 2009

to oracle…


Follow

Get every new post delivered to your Inbox.