pete dashwood (03-21-19, 01:18 AM)
I'm sharing this experience in the hope you can profit from it.

A few weeks ago a main development machine suffered a hard drive crash.
This machine had over 12 million files on it...
To be fair, it was over 7 years old, and hard drives have an MTBF of
around 20,000 hours, so it shouldn't have been unexpected.

My first thought was that over 30 years worth of COBOL code was gone,
along with about 10 years worth of C# and some critical source for PRIMA
products. I knew there were backups (and later found that they were very
current for all the critical code) but it is like being hit with a
hammer and you are in a kind of shock that makes it hard to think straight.

I was devastated and completely de-motivated by it. (I actually
considered closing the Company and went into a depression for a couple
of days... :-))

Fortunately, the response, understanding, and support from clients
helped get me out of it. I decided to replace the machine and see if we
could get back to being productive.

You'd think it would really be pretty simple: Get the new machine, get
it configured and working, load everything from backup and away you go.

Such is not the case. There are SO many things that you take for granted
every day and it is not until disaster strikes that you realize you need
them.

The LAN has to accommodate and recognize the new machine and the other
machines on the LAN have to play nicely with it. The web server has to
be reachable and connected through FTP (ours is in Houston, Texas...we
are in Tauranga, NZ...). There are new versions of tools that you may
want to use (our main development used VS 2012, and SQL Server 2008;
could this be an opportunity to modernize?). What about the virtual
machines we use? (Fortunately, most of these are on external hard drives
so they weren't affected, but we still have to get them recognized and
working with the new server...)

What about mail and Office software? (Mail was the FIRST thing that had
to be restored and recovered so I could contact clients and let them
know what's happening.) Again, the disaster became an opportunity and I
got a really good deal on Office 2019 (don't like the subscription model
of Office 365, and the core code is exactly the same...). Outlook 2019
imported the mail accounts pretty well except that there were some
passwords I had forgotten (some of these accounts were very old...). It
took less time to hack the passwords than it would have to set the
accounts up again from scratch, but a lesson was learned and all the
passwords are safely stored now.

In the end, after 3 weeks full on, I am able to say that we can once
again support our products and provide proper cover to clients.

The transitions from VS 2012 to VS 2017 and from SQL Server 2008 to SQL
Server 2017 were pretty painless (but a lot of actual work).

All of the COBOL has been recovered. (A huge weight off my mind.)

I have used the restoration process to review and remove a lot of stuff
that we really don't use any more, and it has enabled me to target more
stuff that we still use but should have replaced by now.

The new machine has enabled me to come to grips with new features in
Windows. (I like seeing a boot of Win 10 from cold take 12 seconds (SSD
for the OS) instead of 5 minutes, and being able to log in biometrically
with a touch of my fingerprint...) The new quad core, up to 4 GHz
processor is blazingly fast and, for example, a large Fujitsu NetCOBOL
for Windows compile that used to take around 20 seconds on a virtual
COBOL server now takes 4 seconds on the same VM... The VM is just
getting much more power from the host.

But the main points I want to convey here are these:

1. Back up critical code EVERY DAY. (You know intellectually that you
should, but it becomes real when disaster strikes...)

2. Take a full image of the hard drive in critical servers at least
every 3 months. (I had such an image that was 6 weeks old and it was a
major help in the recovery.)

3. Keep calm and don't despair. (I wasted considerable time feeling
sorry for myself and it was only the support I was offered that kicked
me out of it...)

Recovery is still proceeding (it will go on for months as a background
task) but all of the critical services are up and running and in the
process of being baselined and having standard test plans run on them.

PRIMA will be officially back on the air as from Monday, 25th of March.

Cheers,

Pete.
Pascal J. Bourguignon (03-21-19, 01:51 AM)
pete dashwood <dashwood> writes:

> 1. Back up critical code EVERY DAY. (You know intellectually that you
> should, but it becomes real when disaster strikes...)
> 2. Take a full image of the hard drive in critical servers at least
> every 3 months. (I had such an image that was 6 weeks old and it was a
> major help in the recovery.)
> 3. Keep calm and don't despair. (I wasted considerable time feeling
> sorry for myself and it was only the support I was offered that kicked
> me out of it...)


Hard disks are cheap; you should back up more. Make it automatic, and
check that you can restore a working system from the backups easily and
periodically.
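
A minimal Python sketch of the "check that you can restore" step, assuming the backup has first been restored into a scratch directory. It compares SHA-256 checksums of every file under the source tree with the file at the same relative path in the restored copy. The names `file_digest` and `verify_restore` are hypothetical, chosen for illustration, and are not taken from any particular backup product.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_root: Path, restored_root: Path) -> list[str]:
    """List relative paths that are missing or differ in the restored copy."""
    problems = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue  # directories are only walked, not compared
        rel = src.relative_to(source_root)
        restored = restored_root / rel
        if not restored.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif file_digest(src) != file_digest(restored):
            problems.append(f"differs: {rel.as_posix()}")
    return problems
```

An empty list from `verify_restore` means every source file came back byte-for-byte. It says nothing about whether the restored system actually boots, so it complements a periodic full restore test rather than replacing one.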

As for sources, use a remote git repository; then you will always have a
copy of your sources there, not every day, but on each commit/push.
pete dashwood (03-21-19, 06:17 AM)
On 21/03/2019 12:51, Pascal J. Bourguignon wrote:
> pete dashwood <dashwood> writes:
> Hard disks are cheap; you should back up more. Make it automatic, and
> check that you can restore a working system from the backups easily and
> periodically.
> As for sources, use a remote git repository; then you will always have a
> copy of your sources there, not every day, but on each commit/push.

I agree with backing up more often; the above is what I would consider
minimal.

I don't like automatic backup because I need to control what is being
backed up. However, I know people who do it and find it useful.

A remote git repository is fine for Open Source but too risky (IMO) for
commercial copyright code.

For the same reasons, I am not using the cloud... yet... :-) (It is too
risky if the Internet goes out and we need our code.)

(I probably will in the coming year, when I am really sure it can be safely
guaranteed.)

Pete.
Pascal J. Bourguignon (03-21-19, 08:13 AM)
pete dashwood <dashwood> writes:

> A remote git repository is fine for Open Source but too risky (IMO)
> for commercial copyright code.


Of course, remote to your own server! You just need ssh and disk space.
Or you can also use the gitlab software which is open source, and
install it on your own server.

> For the same reasons, I am not using the cloud... yet... :-) (It is too
> risky if the Internet goes out and we need our code.)


There's also OpenCloud.
Kerry Liles (03-21-19, 04:14 PM)
On 3/20/2019 7:18 PM, pete dashwood wrote:
[..]
> PRIMA will be officially back on the air as from Monday, 25th of March.
> Cheers,
> Pete.


Very glad to hear you are on the other side of this problem Pete. I can
relate to your experience - I have had to assist customers on several
occasions with much the same 'out of the blue' impromptu disaster
recovery and the look of horror on their faces when they realized that
they were 'taking backups but had no backups' was hard. I used that
phrase in a talk I gave at a user conference in the 80's. It was (and
likely still would be) astonishing how many people were making backups
but never really analyzed to see if the backups were functional!

Lessons learned that way are permanent. As I said, I am very glad to
hear you and Prima are on the road back.

PS:
I immediately thought about you when I heard about the atrocious mosque
attack - just because you are pretty much the only person I "know" in
NZ! I must say I have tremendous admiration for your elegant and amazing
prime minister. She is very inspirational and her compassion and caring
are a model for everyone.
pete dashwood (03-21-19, 11:49 PM)
On 22/03/2019 03:14, Kerry Liles wrote:
[..]
> NZ! I must say I have tremendous admiration for your elegant and amazing
> prime minister. She is very inspirational and her compassion and caring
> are a model for everyone.


Hi Kerry,

thanks for your positive post.

I'll quickly comment on the event that has rocked our country for the
last week. (I have made quite a few public posts about it in fora like
Quora, and was surprised by the insensitivity of some people who profess
to be Christians but are filled with hate for anyone not of their
persuasion... Fortunately, this is balanced by other Christians who take
a view closer to what their Teacher might have had...)

Some of us have known for a number of years that an attack like this
would be inevitable but we hoped it wouldn't happen.

New Zealand is not Paradise and as the population expands towards 5
million there are going to be a percentage of extremists of every
persuasion.

Nevertheless we have a "culture" here that is instilled in most of us
through childhood (in classrooms and on sports fields) that everyone
deserves a "fair go"... We take people as we find them and are generally
not influenced by someone's haircut, or clothing, or skin colour, or
age, or gender, or belief system... If they treat you right; you treat
them right and have respect and tolerance for their differences. (As long
as they are not hurting people...flaming crosses and lynch mobs would
not be well received here.)

Muslims here have been no more isolated than Baptists, or Catholics, or
anybody else and the incredible outpouring of support over the last week
has reflected this.

Personally, I have become irritated by attempts to blame the Police (the
perpetrator was in custody within 36 minutes of the call out and there
were heroic actions by Law Enforcement who put their lives on the line);
the gun laws (certainly, you don't need an assault rifle or MSSA to
handle things on the farm); the failure of the Intelligence Services
(how can they possibly be expected to track EVERYTHING? We don't live in
a Police State and we would fiercely resist attempts to impose one); the
irresponsibility of Social Media (definitely a factor... driven by
profit and made it possible to have the attack live streamed without any
kind of moderation. It's good to see major companies here withdrawing
advertising from Facebook and I hope the World will follow suit.
Zuckerberg has been silent and he needs to make some noise...); when the
real and final responsibility for this horrific attack rests with ONE
actor: The creature who pulled the trigger. (I don't like to call him a
"man" because that demeans men everywhere. Jacinda Ardern says she will
never say his name and most of us agree with her...)

As for Jacinda, she is a typical Kiwi girl who grew up in a small
farming community called Morrinsville about 30 minutes drive from where
I'm writing this. I didn't vote for her, but, like the rest of the
country, I have been amazed by the courage, tenacity, sensitivity, and
leadership she has shown throughout. Her actions have been inspirational
and her words have been sensible and kind. We are proud of her.

The mourning and grief will continue but even in this darkness, there
are some glimmers of light and we are all closer as a nation. I have
been amazed to see the number of my friends (and self included) who have
shed tears over this because it is so awful and so "un-Kiwi". They
should have been safe here and we feel we let them down. I can't say it
won't happen again, but I wouldn't like to be the perpetrator sentenced
to time in a NZ jail...

Pete.
docdwarf (03-22-19, 02:44 PM)
In article <gfg01hFbum5U1>,
pete dashwood <dashwood> wrote:

[snip]

>1. Back up critical code EVERY DAY. (You know intellectually that you
>should, but it becomes real when disaster strikes...)
>2. Take a full image of the hard drive in critical servers at least
>every 3 months. (I had such an image that was 6 weeks old and it was a
>major help in the recovery.)


Half-right. The mantra was 'Incremental daily, system weekly'. Your backup
software (I use a version of Acronis sufficiently ancient that it hasn't
been rendered worthless by New Features) should be able to perform a 'bare
metal restore': turn off the machine you want to restore, pull out the OS
drive, put in a brand-new drive fresh out of the shrink-wrap, reboot off
a USB and restore.

DD
docdwarf (03-22-19, 02:48 PM)
In article <gfghksFfcufU1>,
pete dashwood <dashwood> wrote:
>On 21/03/2019 12:51, Pascal J. Bourguignon wrote:
>> pete dashwood <dashwood> writes:


[snip]

>> As for sources, use a remote git repository; then you will always have a
>> copy of your sources there, not every day, but on each commit/push.

>I agree with backing up more often; the above is what I would consider
>minimal.
>I don't like automatic backup because I need to control what is being
>backed up. However, I know people who do it and find it useful.


It is time to become one of those people you know.

Any time a need for human intervention is thought necessary ('I need to
control') a possibility for error or procrastination is introduced.

Let the machine's calendar do the work, the worst thing that will happen
is that you'll use more now-inexpensive drive space.

Incremental daily, system weekly. No exceptions except on the 'more
frequently' side if your site gets a lot of data input/update.
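
A rough Python sketch of that schedule, under stated assumptions: the weekly full run falls on Sunday (an arbitrary choice), and an incremental run simply picks up files whose modification time is later than the previous backup. The names `backup_kind` and `files_to_back_up` are hypothetical. Real backup software tracks changes more robustly than a plain mtime comparison.

```python
from datetime import date, datetime
from pathlib import Path


def backup_kind(day: date, full_weekday: int = 6) -> str:
    """Return "full" on the weekly full-backup day, else "incremental".

    full_weekday uses date.weekday() numbering (Monday=0 ... Sunday=6);
    Sunday is an arbitrary assumption for this sketch.
    """
    return "full" if day.weekday() == full_weekday else "incremental"


def files_to_back_up(root: Path, kind: str, last_backup: datetime) -> list[Path]:
    """Select every file for a full run, or only the files modified
    since last_backup for an incremental run."""
    cutoff = last_backup.timestamp()
    selected = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue  # skip directories
        if kind == "full" or path.stat().st_mtime > cutoff:
            selected.append(path)
    return selected
```

Driving this from the machine's scheduler rather than by hand is the point of the advice above: the calendar decides which kind of run happens, not the operator.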

DD
Spiros Bousbouras (03-22-19, 10:04 PM)
On Thu, 21 Mar 2019 12:18:12 +1300
pete dashwood <dashwood> wrote:
> I'm sharing this experience in the hope you can profit from it.
> A few weeks ago a main development machine suffered a hard drive crash.
> This machine had over 12 million files on it...
> To be fair, it was over 7 years old, and hard drives have an MTBF of
> around 20,000 hours, so it shouldn't have been unexpected.


As someone has said "There are two kinds of storage devices , those which
have failed, and those which are about to fail."
pete dashwood (03-23-19, 01:07 AM)
On 23/03/2019 01:48, docdwarf wrote:
[..]
> Incremental daily, system weekly. No exceptions except on the 'more
> frequently' side if your site gets a lot of data input/update.
> DD

Thanks for your post, Doc.

I'll consider the points you made.

Pete.
0robert.jones (03-23-19, 04:21 PM)
On Wednesday, 20 March 2019 23:18:12 UTC, pete dashwood wrote:
[..]
> PRIMA will be officially back on the air as from Monday, 25th of March.
> Cheers,
> Pete.


Glad to hear you are up and running again. I keep at least one offsite backup and anything that goes on the cloud is password protected as are all other backups.
kintalken (03-24-19, 11:39 AM)
> the look of horror on their faces when they realized that
> they were 'taking backups but had no backups' was hard. I used that
> phrase in a talk I gave at a user conference in the 80's. It was (and
> likely still would be) astonishing how many people were making backups
> but never really analyzed to see if the backups were functional!


That was the problem that enabled the Fukushima disaster .
To stop the nuclear reaction in an emergency ,
the system used big pumps to douse the ongoing reaction with water .
Basically a big fire plan was to pump in a lot of water .
But the primary source of the electricity for the pumps was the reactor itself .
The reactor was fucked thus no electricity for the pumps .
The "backup" was some generators , cannot remember if they were batteries or diesel .
In any case the "backup" generators received no regular testing ,
and did not work at the critical moment .
Hence the reaction spiralled out of control and remains so to this day .
Nobody can figure out how to put it out .

If I was in a situation where I was keeping critical servers
in-house then I would implement a policy of
fake server disaster every 6 months .

Also I think I would have a backup server machine
constantly updated to mirror the main server
and ready to take over from the main server
at any time .

Also Pete now while it's fresh in Your mind ,
I suggest You generate a "how-to" document
with as much detail as possible about what it took
to get the new machine going .

That would be a very valuable resource for Your Self
or some one else
should it happen again .

~~~~~~~~~ %&ZHcx;. ~~~~~~~~~~
kintalken (03-24-19, 12:02 PM)
> Personally, I have become irritated by attempts to blame the Police (the
> perpetrator was in custody within 36 minutes of the call out and there
> were heroic actions by Law Enforcement who put their lives on the line);
> the gun laws (certainly, you don't need an assault rifle or MSSA to
> handle things on the farm); the failure of the Intelligence Services
> (how can they possibly be expected to track EVERYTHING?


Yes .
One of the most destructive characteristics of the modern human IMO ---
the immediate outrage and blame thrown at every party
somehow infinitely responsible for making a mistake
or failing to perform with the perfection
required by every outraged arm chair critic .

Truth is instead of immediately throwing blame around
at other people first people should respond to such an event
by searching their own soul .
Because every time You made a racist comment ,
dissed an individual because of some class of beings
they belong to that receives Your petty hate ,
every off-colour joke that demeans others on the basis
of their group association : all of that contributes .

And same goes for turning the perpetrator into some kind of non-human supervillain . Or turning them into an animal You are allowed to hate because of what they have done .

That person is like any other person , has loves , family ,
suffering . Understanding is much more important than vilification . And much more difficult . You have to find and recognize the darkness in Your own soul , and in Your own heart , and in Your own mind , and in Your own words . All of the outrage and finger pointing is in no small part just another outpouring of identification of that which is bad to prove that You your self are good .

Perennial example of that is what every one says about the Nazis and Hitler
docdwarf (03-24-19, 06:45 PM)
In article <e436510c-e0cd-448e-8442-daa81a17c7a4>,
<kintalken> wrote:
>Yes .
>One of the most destructive characteristics of the modern human IMO ---
>the immediate outrage and blame thrown at every party
>somehow infinitely responsible for making a mistake
>or failing to perform with the perfection
>required by every outraged arm chair critic .


You think this is a 'modern human' characteristic? I'd say the 'armchair
critic' is an institution at least as ancient as the armchair.

[snip]

>Perennial example of that is what every one says about the Nazis and Hitler .
>Absolutely hated even now almost 100 years later .
>Try telling some one that if they as exactly the person they are now
>lived in Germany at that time they would have been just like any German
>.
>I have never had anyone respond in any way but to scoff at that
>suggestion and then get mad at me because insulting them .


22 Sep 1945: 'The way I see it the Nazi question is very much like a
Democrat and Republican election fight.' - Gen. George Patton.

That interview effectively ended General Patton's career.

>But it is obviously true .


At least as true as Godwin's Law.

DD
kintalken (03-24-19, 10:56 PM)
On Sunday, March 24, 2019 at 9:45:30 AM UTC-7, docd...@panix.com wrote:
> In article <e436510c-e0cd-448e-8442-daa81a17c7a4>,
> <kintalken> wrote:
> [snip]
> You think this is a 'modern human' characteristic? I'd say the 'armchair
> critic' is an institution at least as ancient as the armchair.


Yes You are probably correct .
But I have not encountered much evidence of whether or not that characteristic was commonplace in the past .
There is of course the folk wisdom "There are 2 types of people : critics and doers" .
I suppose it makes sense that endless whinging and bitching about how everything wrong is wrong because somebody else's fault does not persist in the literature and the science and the art that survives History ,... which is not too surprising :-)

But I do think there is some thing of a very modern phenomenon at play as well ..
We are all descendants of slaves .
All of us have mostly slave genes in our inheritance .
That is because the "masters" tend to get killed by other "masters" or killed by revolt from the "slaves" .
All the "slaves" think the "masters" sit around all day drinking wine eating olives and hanging out with the beautiful ladies .
Then You have for Us the modern phenomenon of the abolishment of slavery and the assertion that all people are equal , all people have rights , etc .
Thus the situation for the modern person is that , though accustomed to having everything important managed for them by the masters , they are no longer slaves ; thus think themselves akin to a master by birthright in regards to personal power ; but not actually responsible or capable as a master needs to be .
Thus as "masters" they demand the authority to dictate reality but as "slaves" they have no predilection but to blame the masters for every problem .

c.f. Nietzsche , Genealogy of Morals
^^^ an incredible little book certainly some of the finest essays ever written , worth a read even if You do not agree with the ideas .

That BTW is not entirely off-topic in this forum .
We have probably all been in this situation :

You are a lowly "code-monkey" , the highest paid of Your fellow "code-monkeys" has a salary matching the salary of a Junior "manager" or worse "design architect" .

You come into work on Monday and architect/manager/master dude shows up to tell the "code-monkeys" (aka slaves) what their next task is .
Blathers on about maximizing return-on-investment via technological innovation or some garbage like that .
Meeting is drawing to a close some one puts up their hand :
"Uhm , can we see the requirements document ????" .
Dude says "Oh yeah I have that on my cell phone we did some whiteboarding last week and I took some pictures" .
Parting words from "master" to the "slaves" :
"Oh and BTW we bought a software solution for $150,000 without consulting any technical staff and You all have to use it for this project .
Oh and by the way we allocated the man hours we think You need for this obviously simple task and if You do not meet the deadline it is three months unpaid overtime for all of You because missing the deadline is Your fault" .

~~~~~~~~ %&ZHcx;. ~~~~~~~~