<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Versioning MySQL data</title>
	<atom:link href="http://www.jasny.net/articles/versioning-mysql-data/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.jasny.net/articles/versioning-mysql-data/</link>
	<description>Helping you out with PHP &#38; MySQL</description>
	<lastBuildDate>Wed, 25 Jan 2012 21:57:47 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: Arnold Daniels</title>
		<link>http://www.jasny.net/articles/versioning-mysql-data/comment-page-1/#comment-336963</link>
		<dc:creator>Arnold Daniels</dc:creator>
		<pubDate>Thu, 13 Oct 2011 09:44:37 +0000</pubDate>
		<guid isPermaLink="false">http://www.adaniels.nl/?p=291#comment-336963</guid>
		<description>Hi Brian,

Thanks for commenting :)

MySQL (and other RDBMS) don&#039;t support versioning natively, which is a pain. This means that you have to do it yourself.

If we look at the most basic versioning system, it saves a copy of a file (or, in this case, a record) whenever the file is changed. Since there can only be one version of a file on a (standard) file system, a copy needs to be saved elsewhere. We&#039;re doing the same with MySQL (so no shoehorning here).

Using LVM is a great way to get a snapshot of the complete database at a certain time. However, this is very much a sysadmin tool. You can&#039;t (easily) use it to show a user (the manager of a company for instance) the history of a record and allow him to roll back changes.

I strongly consider reusing a field for a different purpose to be bad practice. Apart from versioning, let&#039;s say you have an old piece of code that you&#039;ve forgotten about (or that a colleague wrote way back) referencing this field; it will wreak havoc by updating the data incorrectly.

Instead you should always create a new column (or rename the old one), when giving it a new purpose.

The revisioning table should have all columns ever used, even if they no longer exist in the main table. For revisions from a time when a field did not exist, its value will simply be NULL.

Note that this is only a solution for versioning the data and not for versioning the db structure. You should do that separately and treat the revisioning tables like any other table.</description>
		<content:encoded><![CDATA[<p>Hi Brian,</p>
<p>Thanks for commenting <img src='http://www.jasny.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>MySQL (and other RDBMS) don&#8217;t support versioning natively, which is a pain. This means that you have to do it yourself.</p>
<p>If we look at the most basic versioning system, it saves a copy of a file (or, in this case, a record) whenever the file is changed. Since there can only be one version of a file on a (standard) file system, a copy needs to be saved elsewhere. We&#8217;re doing the same with MySQL (so no shoehorning here).</p>
<p>Using LVM is a great way to get a snapshot of the complete database at a certain time. However, this is very much a sysadmin tool. You can&#8217;t (easily) use it to show a user (the manager of a company for instance) the history of a record and allow him to roll back changes.</p>
<p>I strongly consider reusing a field for a different purpose to be bad practice. Apart from versioning, let&#8217;s say you have an old piece of code that you&#8217;ve forgotten about (or that a colleague wrote way back) referencing this field; it will wreak havoc by updating the data incorrectly.</p>
<p>Instead you should always create a new column (or rename the old one), when giving it a new purpose.</p>
<p>The revisioning table should have all columns ever used, even if they no longer exist in the main table. For revisions from a time when a field did not exist, its value will simply be NULL.</p>
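<p>A minimal sketch of how that could look, assuming a hypothetical main table customer with a revision table customer_revision (both names, and the mobile and fax columns, are made up for illustration):</p>
<pre lang="SQL">
-- Hypothetical tables: `customer` (main) and `customer_revision` (its history).

-- New purpose, new column: add it to both tables.
-- It is nullable in the revision table, so older revisions simply hold NULL.
ALTER TABLE `customer` ADD COLUMN `mobile` varchar(50) DEFAULT NULL;
ALTER TABLE `customer_revision` ADD COLUMN `mobile` varchar(50) DEFAULT NULL;

-- Retired column: drop it from the main table only.
-- The revision table keeps it, so old revisions stay readable.
ALTER TABLE `customer` DROP COLUMN `fax`;
</pre>
<p>Any trigger that copies rows into the revision table would have to be adjusted to match the new column list.</p>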
<p>Note that this is only a solution for versioning the data and not for versioning the db structure. You should do that separately and treat the revisioning tables like any other table.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Brian</title>
		<link>http://www.jasny.net/articles/versioning-mysql-data/comment-page-1/#comment-336961</link>
		<dc:creator>Brian</dc:creator>
		<pubDate>Thu, 13 Oct 2011 02:30:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.adaniels.nl/?p=291#comment-336961</guid>
		<description>Right now it&#039;s not really possible to version your data in an RDBMS. You are kind of shoehorning a versioning system onto your data model here, because in an RDBMS you can only have one version of your data at a time. The best alternative you can have is an LVM snapshot, or some other mountable copy of your data at some point in time. The larger issue is that you really cannot separate data from how you interpret it. What every field means and how it is used is important, as it forms intent and information at that point in time, which is not covered by your system. Perhaps if you tied your revision number to the repository tag?</description>
		<content:encoded><![CDATA[<p>Right now it&#8217;s not really possible to version your data in an RDBMS. You are kind of shoehorning a versioning system onto your data model here, because in an RDBMS you can only have one version of your data at a time. The best alternative you can have is an LVM snapshot, or some other mountable copy of your data at some point in time. The larger issue is that you really cannot separate data from how you interpret it. What every field means and how it is used is important, as it forms intent and information at that point in time, which is not covered by your system. Perhaps if you tied your revision number to the repository tag?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Hari K T</title>
		<link>http://www.jasny.net/articles/versioning-mysql-data/comment-page-1/#comment-291077</link>
		<dc:creator>Hari K T</dc:creator>
		<pubDate>Sat, 09 Oct 2010 16:29:03 +0000</pubDate>
		<guid isPermaLink="false">http://www.adaniels.nl/?p=291#comment-291077</guid>
		<description>Nice. Thanks for sharing. It also helped me learn more about triggers.</description>
		<content:encoded><![CDATA[<p>Nice. Thanks for sharing. It also helped me learn more about triggers.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Arnold Daniels</title>
		<link>http://www.jasny.net/articles/versioning-mysql-data/comment-page-1/#comment-210900</link>
		<dc:creator>Arnold Daniels</dc:creator>
		<pubDate>Thu, 26 Nov 2009 00:54:14 +0000</pubDate>
		<guid isPermaLink="false">http://www.adaniels.nl/?p=291#comment-210900</guid>
		<description>I understand, it&#039;s a race condition. Also, actions in the trigger are not sent to the slaves; instead, the slaves invoke their own triggers. I&#039;ll address this in my next post.</description>
		<content:encoded><![CDATA[<p>I understand, it&#8217;s a race condition. Also, actions in the trigger are not sent to the slaves; instead, the slaves invoke their own triggers. I&#8217;ll address this in my next post.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Baron</title>
		<link>http://www.jasny.net/articles/versioning-mysql-data/comment-page-1/#comment-210658</link>
		<dc:creator>Baron</dc:creator>
		<pubDate>Sat, 21 Nov 2009 03:01:42 +0000</pubDate>
		<guid isPermaLink="false">http://www.adaniels.nl/?p=291#comment-210658</guid>
		<description>Arnold, the problem is not conceptual.  The problem is specifically with MySQL&#039;s triggers and binary logging (replication).  It is not occasionally going to fail, it&#039;s basically just going to fail and your slave will end up with different data, or duplicate key errors, or you won&#039;t be able to do point-in-time recovery with binary logs.  See http://www.mysqlperformanceblog.com/2008/09/29/why-audit-logging-with-triggers-in-mysql-is-bad-for-replication/</description>
		<content:encoded><![CDATA[<p>Arnold, the problem is not conceptual.  The problem is specifically with MySQL&#8217;s triggers and binary logging (replication).  It is not occasionally going to fail, it&#8217;s basically just going to fail and your slave will end up with different data, or duplicate key errors, or you won&#8217;t be able to do point-in-time recovery with binary logs.  See <a href="http://www.mysqlperformanceblog.com/2008/09/29/why-audit-logging-with-triggers-in-mysql-is-bad-for-replication/" rel="nofollow">http://www.mysqlperformanceblog.com/2008/09/29/why-audit-logging-with-triggers-in-mysql-is-bad-for-replication/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Log Buffer</title>
		<link>http://www.jasny.net/articles/versioning-mysql-data/comment-page-1/#comment-210645</link>
		<dc:creator>Log Buffer</dc:creator>
		<pubDate>Fri, 20 Nov 2009 19:45:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.adaniels.nl/?p=291#comment-210645</guid>
		<description>&quot;From Arnold Daniels comes a version of versioning MySQL data, which Arnold introduces thus: [...]&quot;

&lt;a href=&quot;http://www.pythian.com/news/5567/log-buffer-170-a-carnival-of-the-vanities-for-dbas&quot; rel=&quot;nofollow&quot;&gt;Log Buffer #170&lt;/a&gt;</description>
		<content:encoded><![CDATA[<p>&#8220;From Arnold Daniels comes a version of versioning MySQL data, which Arnold introduces thus: [...]&#8221;</p>
<p><a href="http://www.pythian.com/news/5567/log-buffer-170-a-carnival-of-the-vanities-for-dbas" rel="nofollow">Log Buffer #170</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Arnold Daniels</title>
		<link>http://www.jasny.net/articles/versioning-mysql-data/comment-page-1/#comment-210403</link>
		<dc:creator>Arnold Daniels</dc:creator>
		<pubDate>Mon, 16 Nov 2009 12:20:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.adaniels.nl/?p=291#comment-210403</guid>
		<description>&lt;strong&gt;&lt;em&gt;Jens:&lt;/em&gt;&lt;/strong&gt; This article shows the basics of this method. I do have a solution for revisioning data across multiple tables. I&#039;ll discuss that in a follow-up article, which I&#039;ll write this week. That solution also has its limits, so I agree that this solution is not universally applicable. Again, more on that later.

This method is not mutually exclusive with an audit trail. I don&#039;t think it is usually worth saving a diff of each change. The alternative is to save a copy of each record, which is what this revisioning system does. The audit trail table can then simply look like this:
&lt;pre lang=&quot;SQL&quot;&gt;
CREATE TABLE `audit_trail` (
  `table` varchar(255) NOT NULL,
  `id` int(10) unsigned NOT NULL,
  `revision` bigint(20) unsigned DEFAULT NULL,
  `user_id` int(10) unsigned DEFAULT NULL,
  `timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  KEY `id` (`table`,`id`,`revision`),
  KEY `revision` (`table`,`revision`),
  KEY `user_id` (`user_id`),
  KEY `timestamp` (`timestamp`)
)
&lt;/pre&gt;
This could replace the history tables. Do note that you&#039;re forced to use surrogate keys for all revisioned tables.

You should still make periodic backups, even when using this method. Those backups are basically snapshots.

As you already state in your article, versioned objects greatly complicate the logic of the application layer. I don&#039;t like that, since that layer is usually already complex, shouldn&#039;t be over-abstracted, and is where most bugs appear.</description>
		<content:encoded><![CDATA[<p><strong><em>Jens:</em></strong> This article shows the basics of this method. I do have a solution for revisioning data across multiple tables. I&#8217;ll discuss that in a follow-up article, which I&#8217;ll write this week. That solution also has its limits, so I agree that this solution is not universally applicable. Again, more on that later.</p>
<p>This method is not mutually exclusive with an audit trail. I don&#8217;t think it is usually worth saving a diff of each change. The alternative is to save a copy of each record, which is what this revisioning system does. The audit trail table can then simply look like this:</p>
<pre lang="SQL">
CREATE TABLE `audit_trail` (
  `table` varchar(255) NOT NULL,
  `id` int(10) unsigned NOT NULL,
  `revision` bigint(20) unsigned DEFAULT NULL,
  `user_id` int(10) unsigned DEFAULT NULL,
  `timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  KEY `id` (`table`,`id`,`revision`),
  KEY `revision` (`table`,`revision`),
  KEY `user_id` (`user_id`),
  KEY `timestamp` (`timestamp`)
)
</pre>
<p>This could replace the history tables. Do note that you&#8217;re forced to use surrogate keys for all revisioned tables.</p>
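<p>As a rough sketch of how such a table might be queried (the table name customer, the id 42 and the date are made-up values):</p>
<pre lang="SQL">
-- Who changed record 42 of the hypothetical `customer` table, and in which revisions?
SELECT `revision`, `user_id`, `timestamp`
FROM `audit_trail`
WHERE `table` = 'customer' AND `id` = 42
ORDER BY `revision`;

-- What happened in the system on a given (made-up) day?
SELECT `table`, `id`, `revision`, `user_id`, `timestamp`
FROM `audit_trail`
WHERE `timestamp` BETWEEN '2009-11-16 00:00:00' AND '2009-11-16 23:59:59'
ORDER BY `timestamp`;
</pre>
<p>The first query should be able to use the (table, id, revision) key, and the second the timestamp key.</p>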
<p>You should still make periodic backups, even when using this method. Those backups are basically snapshots.</p>
<p>As you already state in your article, versioned objects greatly complicate the logic of the application layer. I don&#8217;t like that, since that layer is usually already complex, shouldn&#8217;t be over-abstracted, and is where most bugs appear.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jens Schauder</title>
		<link>http://www.jasny.net/articles/versioning-mysql-data/comment-page-1/#comment-210329</link>
		<dc:creator>Jens Schauder</dc:creator>
		<pubDate>Sun, 15 Nov 2009 06:13:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.adaniels.nl/?p=291#comment-210329</guid>
		<description>When collecting history of data, one should carefully think about the expected use cases. For example, the presented approach of a version per table becomes a nightmare when you try to find a consistent set of data spanning multiple tables.

It is also less than optimal when trying to analyze what kinds of changes happen in the database.

Another question often asked is: what happened at a certain time in the system? That would probably be better answered by some kind of audit trail.

So before you implement an approach like this, I&#039;d recommend thinking about some different options.

I wrote a German article about this some time ago: http://blog.schauderhaft.de/2008/09/14/versionierte-vs-historisierte-objekte/</description>
		<content:encoded><![CDATA[<p>When collecting history of data, one should carefully think about the expected use cases. For example, the presented approach of a version per table becomes a nightmare when you try to find a consistent set of data spanning multiple tables.</p>
<p>It is also less than optimal when trying to analyze what kinds of changes happen in the database.</p>
<p>Another question often asked is: what happened at a certain time in the system? That would probably be better answered by some kind of audit trail.</p>
<p>So before you implement an approach like this, I&#8217;d recommend thinking about some different options.</p>
<p>I wrote a German article about this some time ago: <a href="http://blog.schauderhaft.de/2008/09/14/versionierte-vs-historisierte-objekte/" rel="nofollow">http://blog.schauderhaft.de/2008/09/14/versionierte-vs-historisierte-objekte/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Artur Ejsmont</title>
		<link>http://www.jasny.net/articles/versioning-mysql-data/comment-page-1/#comment-210239</link>
		<dc:creator>Artur Ejsmont</dc:creator>
		<pubDate>Fri, 13 Nov 2009 09:33:07 +0000</pubDate>
		<guid isPermaLink="false">http://www.adaniels.nl/?p=291#comment-210239</guid>
		<description>Hi there, 
Very nice article with a lot of details.

I used to work with a similar solution on Postgres and it worked really well.

The coolest thing we had in our system was that versioning and history were completely separated from the application and kept in the db layer. The only thing the application could do was manipulate the CURRENT_VIEW_TIME value. Views on all the tables would then pick rows from the real table or from the history table, so that you were transparently accessing the data exactly as it was at CURRENT_VIEW_TIME.

We had begin and end time in every history row to make this calculation easier.

Surprisingly, it worked well and was not really that slow either. It was the coolest thing ever! :- )

But I agree: once you add triggers, and lots of them, forget about replication ... well, maybe row-level replication would work for you.

Very nice article! thanks</description>
		<content:encoded><![CDATA[<p>Hi there,<br />
Very nice article with a lot of details.</p>
<p>I used to work with a similar solution on Postgres and it worked really well.</p>
<p>The coolest thing we had in our system was that versioning and history were completely separated from the application and kept in the db layer. The only thing the application could do was manipulate the CURRENT_VIEW_TIME value. Views on all the tables would then pick rows from the real table or from the history table, so that you were transparently accessing the data exactly as it was at CURRENT_VIEW_TIME.</p>
<p>We had begin and end time in every history row to make this calculation easier.</p>
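<p>Roughly, the lookup behind such a view could be sketched like this, assuming a hypothetical customer_history table in which every row carries begin_time and end_time, the current row has end_time set to NULL, and the literal timestamp stands in for CURRENT_VIEW_TIME (an illustration only, not the schema of the system described above):</p>
<pre lang="SQL">
-- Rows of the hypothetical `customer_history` table that were valid at the view time.
SELECT *
FROM customer_history
WHERE begin_time &lt;= '2009-11-01 12:00:00'
  AND (end_time &gt; '2009-11-01 12:00:00' OR end_time IS NULL);
</pre>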
<p>Surprisingly, it worked well and was not really that slow either. It was the coolest thing ever! :- )</p>
<p>But I agree: once you add triggers, and lots of them, forget about replication &#8230; well, maybe row-level replication would work for you.</p>
<p>Very nice article! thanks</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Arnold Daniels</title>
		<link>http://www.jasny.net/articles/versioning-mysql-data/comment-page-1/#comment-210215</link>
		<dc:creator>Arnold Daniels</dc:creator>
		<pubDate>Thu, 12 Nov 2009 22:29:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.adaniels.nl/?p=291#comment-210215</guid>
		<description>I should also mention that the trigger will slow down insert and update queries. So if you&#039;re doing a lot of writes, don&#039;t use this. Since the original table still contains the same data, this won&#039;t affect the speed of read (select) queries.</description>
		<content:encoded><![CDATA[<p>I should also mention that the trigger will slow down insert and update queries. So if you&#8217;re doing a lot of writes, don&#8217;t use this. Since the original table still contains the same data, this won&#8217;t affect the speed of read (select) queries.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
