08-08-2014 03:59 AM - edited 08-08-2014 04:17 AM
We have a backup application that backs up the disk containing the CMS library.
After the backup, CMS reports that a file was not closed by CMS.
Is there a specific way we need to open the file for the backup ?
-CMS-E-NOTBYCMS, data file DISK1:[CMS.LIB]FILE.TXT;6 not closed by CMS.
08-08-2014 04:26 AM
Please refer to the documentation for an explanation of the -CMS-E-NOTBYCMS message:
08-08-2014 04:50 AM
So you're re-inventing OpenVMS BACKUP ?
Do a DIR/FULL on that CMS file BEFORE running your 'Backup application', then another DIR/FULL afterwards. What's changed ?
Does this CMS error happen on the SOURCE disk after the backup has run or on the backup destination disk or after restoring that backup ?
08-11-2014 06:47 AM - edited 08-11-2014 07:06 AM
If you run a BACKUP/IMAGE of the disk or just a BACKUP of the CMS library to the NULL device, do you see the same problem afterwards ?
$ BACKUP DISK1:<CMS.LIB...> NLA0:x.x/SAVE
Did you repair the -CMS-E-NOTBYCMS error using CMS VERIFY/REPAIR ? Does the error show up again after the /REPAIR ? Or does the error only show up again, after you've run your backup application ?
I've seen a couple of those errors on a CMS library, after the disk has been copied with BACKUP/IMAGE during a migration from a real Alpha to the CHARON-AXP emulator. These errors disappeared after a CMS VERIFY/REPAIR.
During the VERIFY/REPAIR phase, the messages printed are:
%CMS-I-FIXHDR, file header of element data file DISK1:<xxx..CMSLIB.CMS$000>xxx.xxx;1 repaired
This seems to indicate that data in the 'file header' had to be repaired. If DIR/FULL does not show any differences, maybe DUMP/HEADER/BLOCK=COUNT=0 does ?
Does DIR/DATE=ACCESSED on the CMS element file report any time values ?
08-12-2014 09:19 AM
The usual trigger for this error is a corrupt CMS library, but there are many ways for that to arise. Directly editing the library is a common trigger. It's entirely unexpected for a BACKUP to cause this, but it seems quite possible that an errant online BACKUP of a CMS library — online BACKUP operations are sketchy at best — followed by a restoration might trigger this.
Usual fix for a CMS library is a CMS tool that copies the contents of the corrupt library into a newly-created library. The HP support folks had access to such a tool, and there may be other tools to extract data from a corrupt library around, though whether those are freely available is another matter.
It'll probably be faster to escalate this question to HP or VSI support, and have them take a look at the commands used and at the libraries involved, at the configuration, and the errors.
Or gain access to the CMS library replication tool, and see what that can recover from the CMS library.
FWIW, I can't tell what version of VMS and CMS are in use here, what BACKUP command was used, nor whether this CMS error is arising after some sort of BACKUP had copied the CMS library or if the error arose after said (online?) BACKUP was restored.
08-13-2014 12:27 AM
Thanks for all the input.
I could reproduce the problem by opening the file with the editor, i.e. edit/tpu/READ_ONLY filename.
I have attached the steps I followed to reproduce the problem.
Additional information: the file has ACLs.
Please let me know if you need any further details.
08-13-2014 12:42 AM
When you said 'Did not find any difference between DIR/FULL output from before and after backup', you probably did not look closely enough at the date/time fields.
I'm assuming that CMS checks ALL the values from the file header of the file, which includes the date/time fields. In your test (adding ACLs to the file), you might recognize the modified date/time values:
Created: 23-JUL-2014 16:17:22.61
Revised: 13-AUG-2014 12:03:44.78 (3)
Expires: <None specified>
Backup: <No backup recorded>
Effective: <None specified>
Recording: <None specified>
Accessed: 23-JUL-2014 16:17:22.63
Attributes: 13-AUG-2014 12:03:44.78
Modified: 23-JUL-2014 16:17:22.63
As you seem to have ACCESS_DATES enabled on your disk ($ SET VOL/VOLUME_CHARACTERISTICS=ACCESS_DATES), those Accessed, Attributes and Modified dates will also be modified based on the corresponding access.
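The before/after DIR/FULL comparison suggested above can be done mechanically rather than by eye. This is only a rough sketch in Python, assuming the two DIR/FULL outputs were captured to text with one date field per line, as in the listing above. The "after" text is the listing from this post; the "before" text, with its Revised and Attributes values, is hypothetical and only there to illustrate. The function names are made up for this example.

```python
# Date/time fields from the DIR/FULL listing quoted in this thread.
DATE_FIELDS = ("Created", "Revised", "Expires", "Backup", "Effective",
               "Recording", "Accessed", "Attributes", "Modified")


def parse_fields(listing):
    """Collect the known date fields from captured DIR/FULL text."""
    fields = {}
    for line in listing.splitlines():
        # Split at the first colon only; the time value has colons too.
        name, sep, value = line.strip().partition(":")
        if sep and name in DATE_FIELDS:
            fields[name] = value.strip()
    return fields


def changed_fields(before, after):
    """Return {field: (old, new)} for every date field that differs."""
    old, new = parse_fields(before), parse_fields(after)
    return {name: (old.get(name), new.get(name))
            for name in DATE_FIELDS
            if old.get(name) != new.get(name)}


# HYPOTHETICAL "before" capture: Revised and Attributes are assumed values.
BEFORE = """\
Created: 23-JUL-2014 16:17:22.61
Revised: 23-JUL-2014 16:17:22.63 (2)
Accessed: 23-JUL-2014 16:17:22.63
Attributes: 23-JUL-2014 16:17:22.63
Modified: 23-JUL-2014 16:17:22.63
"""

# "After" capture, taken from the listing in this post.
AFTER = """\
Created: 23-JUL-2014 16:17:22.61
Revised: 13-AUG-2014 12:03:44.78 (3)
Accessed: 23-JUL-2014 16:17:22.63
Attributes: 13-AUG-2014 12:03:44.78
Modified: 23-JUL-2014 16:17:22.63
"""

for name, (old, new) in changed_fields(BEFORE, AFTER).items():
    print(f"{name}: {old} -> {new}")
```

Run against real captures taken before and after the backup application has touched the file, any field that shows up here is a candidate for what CMS is objecting to.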
As Steven pointed out, try using FIB$M_NORECORD to prevent those access dates from being modified (even for read-only access).
08-13-2014 07:34 AM
BACKUP with /RECORD is compatible with CMS, or some rather large collections of CMS libraries would have blown up badly, so this is either specific to the local configuration, or there's a bug in the (unidentified) version of CMS.
CMS is also compatible with some gonzo ACLs. As a test, remove those ACLs from the reproducer, and test again. This is to determine whether those ACLs are relevant to reproducing the bug here, or if that's just some irrelevant chatter in the reproducer. As an additional and related test, engage BYPASS and test the verification again; that'll eliminate any problems that might have been triggered by those ACLs for the process performing the CMS verification.
If this is being triggered by the last access date, then the (unidentified) version of CMS is broken. Which means either upgrading or patching CMS if the version is not current, or reporting the bug to HP, or maybe using a workaround.
08-14-2014 04:27 AM
A correction on how the problem is reproduced.
The problem is reproduced by applying the ACLs to the file directly, not through CMS.
This causes the Revised date to be modified, hence the problem occurs.
I couldn't reproduce the problem in my test system.
The problem may lie in some of the CMS settings or in the ACLs, so I am still looking for the correct arguments for "sys$qio".
I will try to collect more information from the customer environment.
Once again thanks for all your help.
08-14-2014 06:00 AM
>>> The problem is reproduced by applying the ACL's on file directly, not through the CMS.
Exactly. Nobody should mess with files that are under the control of CMS.
For what it is worth: CMS obviously records the last revision, and it seems that what it records is the revision number. If you manage to change the revision number, CMS VERIFY will complain. A simple reproducer (independent of ODS5/2) is to rename the file to itself with "$ rename file.txt ;". That bumps up the revision number (and the revision date), and CMS VERIFY will complain. (On ODS5 you can manage to change the revision date without changing the revision number, and CMS VERIFY will not complain.)
Setting protections for a file or adding an ACE/ACL changes the revision number as well.
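To tell the two cases apart (revision number bumped versus only the revision date changed), one could pull the number in parentheses out of the "Revised:" line of two DIR/FULL captures. A rough Python sketch follows, assuming the "Revised: date time (n)" layout shown earlier in this thread. The function names are made up for illustration, and this only mirrors the behaviour observed above; it is not how CMS itself is implemented.

```python
import re

# Matches e.g. "Revised: 13-AUG-2014 12:03:44.78 (3)"
REVISED = re.compile(r"Revised:\s*(?P<date>\S+ \S+)\s*\((?P<count>\d+)\)")


def revision(listing):
    """Return (revision date string, revision number) from DIR/FULL text."""
    match = REVISED.search(listing)
    if match is None:
        raise ValueError("no 'Revised:' line with a revision number found")
    return match.group("date"), int(match.group("count"))


def classify(before, after):
    """Say whether the revision number, only the date, or nothing changed."""
    old_date, old_count = revision(before)
    new_date, new_count = revision(after)
    if new_count != old_count:
        return "count changed"   # the case where CMS VERIFY was seen to complain
    if new_date != old_date:
        return "date only"       # reported above as possible on ODS5
    return "unchanged"
```

Usage: pass the text of a DIR/FULL capture taken before the rename, protection change or ACL change as `before`, and one taken afterwards as `after`.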
08-14-2014 05:50 PM
This is likely either a CMS bug, or an undocumented restriction.
Upgrade to current or patch to current CMS, then escalate to HP if this problem persists.
I know this configuration used to work. In a previous incarnation, I had ACLs attached to ~600 separate CMS libraries and modules — though the CMS directories did require different ACLs from those attached to contents of the CMS libraries — and that all worked just fine. Including adding and updating those ACLs on a nightly basis. (Though those disks did not have access dates enabled, IIRC.) Which implies that there's something wrong with the unspecified CMS version here, and its processing of file metadata.
08-15-2014 01:56 PM
>>> Which implies that there's something wrong with the unspecified CMS version here, and its processing of file metadata.
Which implies that a change in the metadata of a CMS object is something CMS should tolerate? I don't think so.
The file in question was in a [.CMS$000] sub-directory. The file is the internal representation, the implementation, of a CMS object, an element of the CMS library. It is no longer a user's file.
Whatever the OP wants to achieve by setting/adding ACEs/ACLs, CMS should know about it. Bypassing CMS looks wrong to me, as CMS has to preserve/guarantee the integrity of its objects.
08-19-2014 02:52 PM
>>> Which implies that a change in the metadata of a CMS object is something CMS should tolerate? I don't think so.
I disagree. CMS should certainly care about changes to the file data, but it should not be sensitive to the metadata.
If adding an ACL to a CMS file or directory is tipping over CMS — or any other changes to the metadata that are not also relevant to the stored file data — then CMS is broken.
I've used ACLs with CMS before. Heavily. For a product y'all are familiar with, too. These ACLs were there to manage access control, which is no small part of managing source code. CMS has worked just fine with these ACLs, too. The identifiers and the subsystem identifiers all worked exactly as expected, and allowed managing the available user access to the CMS libraries.
That these ACLs worked implies that whatever is happening here is a change to CMS and/or VMS.
And yes, sure, maybe CMS might read the metadata to see if it might then need to invoke a more involved checksum on the files to ensure consistency, but there's no reason to squawk if the file data is found consistent. Worst case, a CMS repair pass clears this. But that repair pass should not be necessary.