Re: Minimizing backup induced downtime
- From: Alexander Skwar <alexander@xxxxxxxxxx>
- Date: Thu, 12 Jul 2007 16:14:19 +0200
fitzjarrell@xxxxxxx <fitzjarrell@xxxxxxx> wrote:
Comments embedded.
That's the way I like it best ;)
On Jul 12, 6:33 am, Alexander Skwar <alexan...@xxxxxxxxxx> wrote:
1) Shut down application, which uses Oracle as a backend
2) Shut down Oracle
3) Create filesystem snapshots with ZFS on Solaris 10
4) Start backup to tape of filesystem snapshots. When done, remove
snapshots
5) Startup Oracle
6) Startup application
Step 4 is run in the background. Because of this, the actual
downtime is very small (a matter of seconds; at most, it's 1 minute).
I'm currently shutting down everything, so that the files on backup are
in a consistent state.
Everything? Or everything application-related? I think it's the
latter, but please clarify this.
Yes, it's the latter. I shut down "everything application related".
I.e. I'm doing "/etc/init.d/teamcenter stop ; /etc/init.d/oracle stop".
This shuts down "everything".
Sorry. I should have been clearer.
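For illustration, the six steps of the current procedure could be sketched roughly as below. This is only a sketch, not my actual script: the dataset names, their default mountpoints (/<dataset>), the tape device and the "start" arguments to the init scripts are assumptions, and with DRY_RUN left at 1 the commands are only printed.

```shell
#!/bin/sh
# Sketch of the cold-backup sequence: stop, snapshot, restart, then tape.
# Hypothetical values (not from the post): dataset names, their default
# mountpoints (/<dataset>), and the no-rewind tape device.
DRY_RUN=${DRY_RUN:-1}                          # 1 = only print the commands
SNAP="backup-$(date +%Y%m%d)"
DATASETS=${DATASETS:-"tank/oracle tank/vault"}
TAPE=${TAPE:-/dev/rmt/0n}

run() {
    # Print the command; execute it only when DRY_RUN is not 1.
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

tape_and_cleanup() {
    # Step 4: write each snapshot to tape, then remove the snapshot.
    for ds in $DATASETS; do
        run tar cf "$TAPE" "/$ds/.zfs/snapshot/$SNAP" &&
            run zfs destroy "$ds@$SNAP"
    done
}

cold_backup() {
    run /etc/init.d/teamcenter stop            # 1) stop the application
    run /etc/init.d/oracle stop                # 2) stop Oracle
    for ds in $DATASETS; do                    # 3) snapshot (near-instant)
        run zfs snapshot "$ds@$SNAP"
    done
    tape_and_cleanup &                         # 4) tape backup in background
    run /etc/init.d/oracle start               # 5) start Oracle
    run /etc/init.d/teamcenter start           # 6) start the application
}
```

Calling cold_backup prints the command sequence; the downtime is only the span between the two "stop" calls and the two "start" calls, because the tape step is pushed into the background.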
I'm now thinking about how to fit RMAN into this picture. I think it
might look something like this:
1) Shut down application, which uses Oracle as a backend
2) Have RMAN create backup of database
3) Create filesystem snapshots with ZFS on Solaris 10
4) Start backup to tape of filesystem snapshots. When done, remove
snapshots
5) Startup application
Here, step 3 is in background. But I cannot start step 3, before step 2
is actually finished. Because of this, the downtime of the application
will be larger, won't it?
Yes, since you are disallowing access to the database during the
backup, which isn't necessary for RMAN to complete its job.
Yes, as far as just the "simple" Oracle side is concerned, I could
leave Oracle running while RMAN runs (or rather, don't I
actually NEED to leave Oracle running?). That's understood.
But you
do list valid concerns further down the post.
Granted, I don't have to shut down the
application, but then I don't know that I'm in a consistent state.
Suppose that I do not shut down the app while RMAN is running. Then a user
comes, adds/deletes/modifies something. This modification is then, of
course, not part of the backup I'm doing at this run.
That depends upon when the update/insert/delete occurs and on which
table/tablespace is affected.
Yes. No. :) I'm actually not so much concerned about the Oracle
side of things. One thing I forgot to mention is that I've got
two "databases" (or maybe I haven't emphasized this point strongly
enough):
- There's the Oracle database
- and a file storage (a so called "vault")
The vault is managed by the application. It stores files there
and also stores "pointers" to these files in some table in Oracle.
The vault is essentially just a directory (with some subdirs).
BTW: Changing the application is *not* possible. At least it's not
realistic.
The problem is, that users could, through the application, delete
and maybe even modify files, which are stored on this directory.
So, suppose the following timeline:
|-------------------|-----------------|-------|-->
t1                  t2                t3      t4
At t1, I start RMAN backup. RMAN backup is finished at t3. At t4, I'd
create the snapshots (what I called step 3 in my previous list of tasks).
Now at t2, a user comes in and deletes or even worse (IMO) modifies
a file. The backup starts essentially at t4. At this point in time,
the database "says" that there should be a file xyz.doc with such
and such contents, but in reality, the file is either gone (user deleted
it) or it has a different content. Granted, the chance of that happening
is close to zero in my setup, as the backup is done at a time of night
when "usually" nobody works. But there's just the slim chance of something
like this happening.
Anyway. I'd like to get to a point, where it's (close to) impossible
for a user to modify files while the backup is running.
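A mismatch of the "file deleted at t2" kind could at least be detected after the fact. The sketch below assumes a hypothetical text file with one vault-relative path per line (exported from whatever Oracle table holds the pointers; I'm not naming the table here) and compares it against a snapshot directory. It only catches missing files, not modified ones.

```shell
#!/bin/sh
# check_vault LISTFILE SNAPDIR
# LISTFILE: hypothetical export of the file pointers stored in Oracle,
#           one path per line, relative to the vault root.
# SNAPDIR:  vault root inside the snapshot, e.g. <vault>/.zfs/snapshot/<name>
check_vault() {
    list=$1
    snapdir=$2
    missing=0
    while IFS= read -r rel; do
        [ -n "$rel" ] || continue              # skip empty lines
        if [ ! -f "$snapdir/$rel" ]; then
            echo "MISSING: $rel"
            missing=$((missing + 1))
        fi
    done < "$list"
    echo "$missing file(s) referenced by the database but absent from the snapshot"
    [ "$missing" -eq 0 ]                       # exit status 0 only if consistent
}
```

The function returns a non-zero status when at least one referenced file is absent, so a backup script could refuse to label that run as consistent.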
Hm.
Maybe my problem is, that the time difference between t1 and t4 is
too large. Maybe it would be better for me, to do:
1) Take snapshot of vault
2) Start RMAN backup
Step 1) takes a split second. So there's of course the odd chance that
somebody hops in at point "1.5", but I think that's very unlikely to
happen (as I wrote already, I'm doing the backup at a time of night
when the system is not used by users - "normally").
Hm. Maybe that's the way to go. The advantage would then be that there's
no downtime at all for doing the backup. However, recovery will take
longer and will be much more complicated than with a simple shell-script
based backup. Hm. With some good documentation on how to rebuild a system
in case of a crash, that shouldn't be such a big problem.
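As a rough sketch, this no-downtime variant (snapshot the vault first, then let RMAN back up the running database) might look like the following. The dataset name is hypothetical, the RMAN command shown is just the plain "BACKUP DATABASE PLUS ARCHIVELOG" form, and it assumes the database runs in ARCHIVELOG mode, which an online RMAN backup requires. With DRY_RUN left at 1 nothing is executed.

```shell
#!/bin/sh
# Sketch: snapshot the vault, then run an online RMAN backup.
# Hypothetical value (not from the post): the vault dataset name.
DRY_RUN=${DRY_RUN:-1}                          # 1 = only print the commands
SNAP="backup-$(date +%Y%m%d)"
VAULT_DS=${VAULT_DS:-tank/vault}

run() {
    # Print the command; execute it only when DRY_RUN is not 1.
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

rman_backup() {
    # 2) Online RMAN backup; the database stays open throughout.
    echo "+ rman target / (BACKUP DATABASE PLUS ARCHIVELOG)"
    [ "$DRY_RUN" = 1 ] && return 0
    rman target / <<EOF
BACKUP DATABASE PLUS ARCHIVELOG;
EOF
}

online_backup() {
    run zfs snapshot "$VAULT_DS@$SNAP"         # 1) takes a split second
    rman_backup                                # 2) application keeps running
}
```

The window for a user to slip in at point "1.5" is then only the gap between the two calls in online_backup; the vault snapshot would still have to go to tape afterwards, as in step 4 of the lists above.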
Even worse - the FS snapshot
doesn't "harmonize" with what's in the DB (the application
creates/deletes files in a so called "vault", which is some directory on
the server).
Now you have a situation where possibly what you are currently doing
is a better strategy for your environment than using RMAN would be.
It's at least a simpler approach. IMO. Not as far as taking the
backup of Oracle is concerned. But Oracle is just one piece in this
puzzle. And as far as business is concerned, it's not even the most
important one - the vault is more important.
Restoring your current snapshots would provide a working database;
restoring snapshots created by your second scenario would require a
restore and recover operation through RMAN, consuming more time.
Yep. And as far as the business is concerned, there'll be no advantage
in the RMAN based backup. At least none that's relevant (downtime in
the middle of the night is acceptable - especially as it is a downtime
of just a few seconds).
Because of all of this, I'd like to shut down the application while RMAN
is running.
And that's fine, however I have noted some issues with your proposed
backup strategy using RMAN which you won't have with your current
method.
Thanks.
In such a scenario/setup, wouldn't RMAN make the downtime larger?
Yes, both on the backup side and on the recovery side.
Oh. Valid point from you. I didn't even consider recovery yet. With
my current setup, all I need to do is a full restore in case of a
full disaster (i.e. the server burned down). That's quite a simple task,
as I just need to restore everything and I'm directly back in a working
state.
Thanks a lot for taking the time to read and post a helpful response.
I appreciate it!
Alexander Skwar