2018-10-18

preupgrade.jar - enforced recommendations



Recently I upgraded a database from version 12.1 to 12.2.
I take upgrades seriously - similar to (but not as skilled as) Mike Dietrich. So I downloaded the latest preupgrade.zip and unzipped it [I made a small error by unzipping it NOT to $ORACLE_HOME/rdbms/admin but to a temporary directory - more about this later].
Apart from such errors, I try to follow Oracle's recommended Upgrade Process.

At some point Oracle recommends gathering dictionary stats "the night before starting the upgrade".

For me this is a recommendation, not an obligation. I had (from my perspective) very good reasons NOT to follow this recommendation.

Still, during the upgrade, the preupgrade.log shows
+ (AUTOFIXUP) Gather stale data dictionary statistics prior to database
     upgrade in off-peak time using:

      EXECUTE DBMS_STATS.GATHER_DICTIONARY_STATS;

     Dictionary statistics do not exist or are stale (not up-to-date).

     Dictionary statistics help the Oracle optimizer find efficient SQL
     execution plans and are essential for proper upgrade timing. Oracle
     recommends gathering dictionary statistics in the last 24 hours before
     database upgrade.

     For information on managing optimizer statistics, refer to the 12.1.0.2
     Oracle Database SQL Tuning Guide.

If they say AUTOFIXUP - treat this as a promise (or threat).

After preupgrade.jar, the next step is to run preupgrade_fixups.sql.
This calls (among others)
fixup_result := dbms_preup.run_fixup('dictionary_stats');
If you dig through the package and the other functionality, you will find the function dictionary_stats_check in preupgrade_package.sql. There it runs
BEGIN
    EXECUTE IMMEDIATE
     ' select 1 from dual where exists(
              select distinct operation
                from DBA_OPTSTAT_OPERATIONS
                        where operation =''gather_dictionary_stats''
                        and start_time > systimestamp -  INTERVAL ''24''  HOUR) '
      INTO dictionary_stats_recent;

and if no dictionary stats run was completed recently, this function is executed:

-- *****************************************************************
--     This fixup executes dictionary stats pre upgrade
-- *****************************************************************
FUNCTION dictionary_stats_fixup          (
         result_txt IN OUT VARCHAR2,
         pSqlcode    IN OUT NUMBER) RETURN NUMBER
IS
    stats_result BOOLEAN;
    sys_string varchar2(5):='SYS';
BEGIN
   stats_result := run_int_proc('DBMS_STATS.GATHER_DICTIONARY_STATS', result_txt, pSqlcode);

   IF (stats_result) THEN
       RETURN c_success;
   ELSE
       RETURN c_failure;
   END IF;
END dictionary_stats_fixup;
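
The fixup only fires when the check finds no recent run. A sketch of the same check, run standalone, to see in advance what the fixup scripts will decide:

SELECT operation, start_time
  FROM dba_optstat_operations
 WHERE operation = 'gather_dictionary_stats'
   AND start_time > SYSTIMESTAMP - INTERVAL '24' HOUR;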

That's the moment when I recognised how seriously Oracle takes the gathering of dictionary stats.

A little rant on twitter about this aggressive recommendation got me a nice reply in which the recommendation is stated:

🤔

At least I know where to change the code - but it has to be done before preupgrade.jar is run!

But there is a 2nd situation in which DBMS_STATS.GATHER_DICTIONARY_STATS is run:
after the upgrade itself, postupgrade_fixups.sql must be run.
Here comes the side note on my initial statement about why preupgrade.zip MUST be unzipped to $ORACLE_HOME/rdbms/admin:
VARIABLE admin_preupgrade_dir VARCHAR2(512);

REM
REM    point PREUPGRADE_DIR to OH/rdbms/admin
REM
DECLARE
    oh VARCHAR2(4000);
BEGIN
    dbms_system.get_env('ORACLE_HOME', oh);
    :admin_preupgrade_dir := dbms_assert.enquote_literal(oh || '/rdbms/admin');
END;
/

DECLARE
    command varchar2(4000);
BEGIN
    command := 'CREATE OR REPLACE DIRECTORY PREUPGRADE_DIR AS ' || :admin_preupgrade_dir;
    EXECUTE IMMEDIATE command;
END;
/

@?/rdbms/admin/dbms_registry_basic.sql
@?/rdbms/admin/dbms_registry_extended.sql


REM
REM    Execute the preupgrade_package from the PREUPGRADE_DIR
REM    This is needed because the preupgrade_messages.properties file
REM    lives there too, and is read by preupgrade_package.sql using
REM    the PREUPGRADE_DIR.
REM
COLUMN directory_path NEW_VALUE admin_preupgrade_dir NOPRINT;
select directory_path from dba_directories where directory_name='PREUPGRADE_DIR';
set concat '.';
@&admin_preupgrade_dir./preupgrade_package.sql
That's the reason: when you unzip it to another directory, an older version of the preupgrade scripts is used after the upgrade - maybe not a desired thing!

Later, another fixup is called:
fixup_result := dbms_preup.run_fixup('post_dictionary');
In this case, that means this function is run:

-- *****************************************************************
--     POST_DICTIONARY_CHECK Section
--     This check recommends re-gathering dictionary stats post upgrade
--     The logic in the query is:  Check if statistics has been taken
--     after upgrade, if not report it and generate the fixup in the
--     postupgrade fixup script, after the fixup run, it will not fail
--     and therefore it will report this check as successfull.
-- *****************************************************************
FUNCTION post_dictionary_check          (result_txt OUT VARCHAR2) RETURN NUMBER
IS
dictionary_stats_recent  NUMBER := 0;
correct_version boolean := TRUE;

BEGIN
  IF dbms_registry_extended.compare_versions(db_version_4_dots, C_ORACLE_HIGH_VERSION_4_DOTS, 4) < 0 THEN
     correct_version := FALSE;
  END IF;

  BEGIN
    EXECUTE IMMEDIATE
     ' select 1 from dual where exists(
              select distinct operation
                from DBA_OPTSTAT_OPERATIONS
                        where operation =''gather_dictionary_stats''
                        and start_time > (select max(OPTIME) from registry$log where cid =''UPGRD_END'')) '
      INTO dictionary_stats_recent;
    EXCEPTION
      WHEN NO_DATA_FOUND then
        null;
  END;
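
Again, a sketch of the same check run standalone (as SYS, since registry$log is a SYS-owned base table):

SELECT operation, start_time
  FROM dba_optstat_operations
 WHERE operation = 'gather_dictionary_stats'
   AND start_time > (SELECT MAX(optime)
                       FROM registry$log
                      WHERE cid = 'UPGRD_END');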

After all this, I STILL recommend using the LATEST preupgrade.jar:
unzip it to the proper $ORACLE_HOME/rdbms/admin,
execute it as documented,
and don't mess around with it!
Don't mess around with it unless Oracle Support says so (or you take all the unforeseen side effects on your own credit).

2018-10-11

SQL Real Time Monitoring pure HTML report (thanks to SQLDeveloper 18.3)

Yesterday (from the perspective of writing this post) SQL Developer 18.3 came out.
(It seems SQL Developer does not strictly follow the Oracle Database release number schema, otherwise it would have to be 18.4 already, as it was released in the 4th quarter of 2018.)

There are many bugs fixed in 18.3 and some nice enhancements as well.

My favorite enhancement is Real Time SQL Monitoring: HTML exports are now available - no Flash!

As Tanel Poder has already written about it (Generate Oracle SQL Monitoring Reports as HTML using SQL Developer v18.3 (no Flash needed)), I'll show how you can use it without SQL Developer.
I'm not against SQL Developer, in fact I'm very happy with it!
Just the Real Time SQL Monitor tab seems kind of unresponsive to me, especially on DBs with many active sessions and high load.


So if I don't want to watch a java process drawing circles and showing blurry previews, I first query v$sql_monitor. Besides all the columns I need to identify my SQL of interest, I need the columns SQL_ID, SQL_EXEC_ID, and SQL_EXEC_START.
And, of course, at least once you must create a new HTML report with SQL Developer! Save this report as a template for later use.
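
A minimal sketch to collect these identifiers (add whatever columns help you spot your SQL of interest):

SELECT sql_id, sql_exec_id, sql_exec_start, status, username
  FROM v$sql_monitor
 ORDER BY sql_exec_start DESC;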

When you open the HTML report in an editor, there are 2 lines of special interest:
 var data_sqlId = '3v64dcg0rja6k'; 
 var data_xml = '<report db_version="12.1.0.2.0...>"

You can replace them easily with the content for your SQL of interest.
To get the proper XML you can use DBMS_SQLTUNE.report_sql_monitor. But as this function provides multiline XML while the HTML expects the XML on one line, the call should be
SELECT XMLSerialize( DOCUMENT xmltype(DBMS_SQLTUNE.report_sql_monitor(
  sql_id         => '3v64dcg0rja6k',
  type           => 'XML',
  SQL_EXEC_ID    => 33554648, 
  SQL_EXEC_START => to_date('2018-10-10 05:36:17', 'YYYY-MM-DD HH24:MI:SS'),
  report_level   => 'ALL')) NO INDENT ) AS XML_report
FROM dual;
If you omit SQL_EXEC_ID or SQL_EXEC_START, the report will still compile, but you should understand the difference in the data you get displayed.
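
To avoid copy/pasting the huge one-line XML from a GUI, you can also spool it straight into a file; a SQL*Plus sketch (the SET values are my assumptions, adjust as needed):

SET pagesize 0 linesize 32767 long 100000000 longchunksize 32767 trimspool on feedback off
SPOOL /tmp/sqlmon_xml.txt
SELECT XMLSerialize( DOCUMENT xmltype(DBMS_SQLTUNE.report_sql_monitor(
  sql_id         => '3v64dcg0rja6k',
  type           => 'XML',
  SQL_EXEC_ID    => 33554648,
  SQL_EXEC_START => to_date('2018-10-10 05:36:17', 'YYYY-MM-DD HH24:MI:SS'),
  report_level   => 'ALL')) AS CLOB NO INDENT ) AS XML_report
FROM dual;
SPOOL OFF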

So the minimum number of times you have to create a new HTML-only SQL Real Time Monitoring report with SQL Developer is once. But if you read this (and your DBs are properly licensed), spread the good news to all the people who might benefit from these reports!


Please keep in mind that the HTML file loads CSS and JS files from a URL at Oracle's Content Delivery platform - so they are subject to change without a new release of SQL Developer, the RDBMS, or anything else. So expect unexpected changes at any time!

2018-10-01

seeing your DB as it was some minutes ago

Last week, during a discussion with a colleague, we wondered whether it would make sense to have the possibility in SQLDeveloper to see the system "as it was some minutes ago".
Small errors can happen, and resource control isn't always perfect in every company.

Björn Rost suggested to use DBMS_FLASHBACK.ENABLE_AT_TIME:

A very clever suggestion!

Let's first look at the documentation:



DBMS_FLASHBACK


Using DBMS_FLASHBACK, you can flash back to a version of the database at a specified time or a specified system change number (SCN).

and in more detail:

DBMS_FLASHBACK Overview

DBMS_FLASHBACK provides an interface for the user to view the database at a particular time in the past, with the additional capacity provided by transaction back out features that allow for selective removal of the effects of individual transactions. This is different from a flashback database which moves the database back in time.
When DBMS_FLASHBACK is enabled, the user session uses the Flashback version of the database, and applications can execute against the Flashback version of the database.
DBMS_FLASHBACK is relevant only for the session in which it's called, so it doesn't change the database.
But it's unclear what's meant by database in this context.

Here a small testcase which shows some unexpected results of DBMS_FLASHBACK:
(I slightly edited the output for better readability.)

drop table x1;
create table x1 (u number);

insert into x1 (u) values (1);

CREATE OR REPLACE FUNCTION "RETURN_SOMETHING" (
    i NUMBER
) RETURN VARCHAR2 AS 
-- Version 1
BEGIN
    IF ( i = 1 ) THEN
        RETURN 'red';
    ELSE
        RETURN 'blue';
    END IF;
END return_something;
/


select to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS') now, u, return_something(u) rrr,
     dbms_flashback.get_system_change_number SCN
from x1;

NOW                          U RRR          SCN
------------------- ---------- ----- ----------
2018-10-01 19:37:56          1 red     24042787


exec sys.DBMS_SESSION.sleep(60);
There should be nothing fancy up to this point.
The table and function are created and the select works fine.
I need the DBMS_SESSION.sleep to copy/paste the proper timestamp into the next block of code:

update x1 set u = 2 where u = 1;

1 row updated.

CREATE OR REPLACE FUNCTION "RETURN_SOMETHING" (
    i NUMBER
) RETURN VARCHAR2 AS 
-- Version 2
BEGIN
    IF ( i = 1 ) THEN
        RETURN 'oans';
    ELSE
        RETURN 'zwoa';
    END IF;
END return_something;
/

select to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS') now, u, return_something(u) rrr
from x1;

NOW                          U RRR  
------------------- ---------- -----
2018-10-01 19:39:32          2 zwoa 


Still nothing spectacular here - just the preparation done.
Now onto the interesting part:

EXEC dbms_flashback.enable_at_time(to_timestamp('2018-10-01 19:38:00','YYYY-MM-DD HH24:MI:SS'));


select to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS') now, u, return_something(u) rrr
from x1;

NOW                          U RRR  
------------------- ---------- -----
2018-10-01 19:39:33          1 oans 



col text for A50
select text
from dba_source
where owner='BERX'
and name='RETURN_SOMETHING'
order by LINE asc;

TEXT                                              
--------------------------------------------------
FUNCTION        "RETURN_SOMETHING" (
    i NUMBER
) RETURN VARCHAR2 AS 
-- Version 1
BEGIN
    IF ( i = 1 ) THEN
        RETURN 'red';
    ELSE
        RETURN 'blue';
    END IF;
END return_something;

11 rows selected. 


Here we can see how DBMS_FLASHBACK.ENABLE_AT_TIME is set to a time between the 1st block (insert & create function) and the 2nd block.
With this setting, the content of table X1 is as expected. DBA_SOURCE also shows the old (Version 1) code of RETURN_SOMETHING.
But the function itself is not changed in memory and works as of its state NOW, not as of the given flashback time.
The flashback-time version of RETURN_SOMETHING is even visible when you open it in SQLDeveloper (you have to believe me or test it).
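
For comparison, a plain flashback query returns the old data for a single statement without switching the whole session; a sketch reusing the timestamp from above:

select u from x1
  as of timestamp to_timestamp('2018-10-01 19:38:00','YYYY-MM-DD HH24:MI:SS');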
Don't forget to clean up after the tests:

exec dbms_flashback.disable;

select to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS') now, u, return_something(u) rrr
from x1;

NOW                          U RRR  
------------------- ---------- -----
2018-10-01 19:39:33          2 zwoa 


For this testcase no COMMIT was used. ;-)

2018-09-04

TFA reports full dba_feature_usage_statistics to MyOracleSupport

Those who deal with MyOracleSupport (MOS) regularly will have noticed its demand for Trace File Analyzer (TFA) diagnostic collections.
Few will know they are also providing the full content of their DB's DBA_FEATURE_USAGE_STATISTICS view; at least when running
tfactl diagcollect -database <DB_UNIQUE_NAME>

The collection calls tfa_home/resources/sql/db_feature_usage.sql and its result is stored in the <hostname>.tfa_<timestamp>.zip file.
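
If you want to see what is handed over, you can query the view yourself; a minimal sketch (the TFA script selects more details):

select name, version, detected_usages, currently_used, last_usage_date
  from dba_feature_usage_statistics
 order by name;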

I did not find any mention of this behavior in the documentation, so it might be worth posting about it.

This is at least true for version 18.3.0.0.0 (the latest version available on 2018-09-04).

RBAL process needs to release FORMER disk

Oracle's Automatic Storage Management (ASM) has many nice advantages over other volume managers. One of my favorites is adding, removing, or rebalancing disks without any interruption for the database. Unfortunately, the removal of devices after the rebalance has completed is not as straightforward as expected:
In a current environment (Grid Infrastructure: 12.2.0.1.180417, RDBMS: 12.1.0.2.180116 and others), after ALTER DISKGROUP xxx DROP DISK yyy, the specific disks are in status FORMER.
Due to a specific schema of directories & symlinks it takes some effort to get to the real /dev/dm-xxx device.
Still, several RBAL processes of different DBs (I did not find a deterministic pattern for which processes or DBs) hold a file handle on this specific device. To make it more interesting: the diskgroup (DG) or disk was never used by any of those DBs. We have a strict mapping of DBs to DGs, so it's easy for me to claim that.
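
A sketch to list the affected disks (run in the ASM instance):

select group_number, disk_number, name, path, header_status
  from v$asm_disk
 where header_status = 'FORMER';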

To tell the RBAL processes to release this "stale" filehandle, I have to run
  ALTER DISKGROUP zzz REBALANCE
Any DG does the trick; it is not required to be related to the DB of the RBAL process at all.
As diskgroups are normally balanced, this is a kind of NOP. Still, it's required to wake up RBAL and make it release unused filehandles.

I still have no idea why these RBAL processes open a device their DB never needs at all;
but as I have my proper workaround, I won't investigate any further.

May this help someone out there, or at least me, when I hit this odd behavior again ;-)

2018-08-08

unwanted cleanup in /var/tmp/.oracle

In my previous blog post missing oraagent.bin process I raised a question:
why do files in /var/tmp/.oracle disappear?
At least this question I am now able to answer.

With the help of the Linux kernel's audit system and a proper rule:
-a always,exit -F arch=b64 -S rename,rmdir,unlink,unlinkat,renameat -F path=/var/tmp/.oracle -F key=ora.delete

I was able to see the culprit process:
----
type=PROCTITLE msg=audit(08/06/2018 17:13:48.382:58122) : proctitle=/usr/bin/systemd-tmpfiles --clean 
type=PATH msg=audit(08/06/2018 17:13:48.382:58122) : item=1 name=sCRSD_IPC_SOCKET_11 inode=50924979 dev=00:25 mode=socket,777 ouid=root ogid=root rdev=00:00 nametype=DELETE 
type=PATH msg=audit(08/06/2018 17:13:48.382:58122) : item=0 name=/ inode=45518 dev=00:25 mode=dir,sticky,777 ouid=root ogid=root rdev=00:00 nametype=PARENT 
type=CWD msg=audit(08/06/2018 17:13:48.382:58122) :  cwd=/ 
type=SYSCALL msg=audit(08/06/2018 17:13:48.382:58122) : arch=x86_64 syscall=unlinkat success=yes exit=0 a0=0x5 a1=0x556fb11fb17b a2=0x0 a3=0x3 items=2 ppid=1 pid=25307 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=systemd-tmpfile exe=/usr/bin/systemd-tmpfiles key=ora.delete 
----

So it was a feature of systemd: systemd-tmpfiles.

In detail it comes from
/usr/lib/tmpfiles.d/tmp.conf 
this particular line:
v /var/tmp 1777 root root 30d

The fix for this is easy: create a proper file (copy) in
/etc/tmpfiles.d
and add these lines at the end:
# Exclude /var/tmp/.oracle
x /var/tmp/.oracle
But as research on MOS and google did not come up with any issue regarding systemd-tmpfiles and crsd.bin, oraagent.bin or any other keyword, something else must be a problem here as well.
This (and the other question from my previous post) are questions for further investigations.

So much for now.

2018-07-27

missing oraagent.bin process

It all started with a small error when I tried to stop/start a service:

srvctl start service -d cmpp01_site1 -s cmpp01
PRCD-1084 : Failed to start service CMPP01 
PRCR-1079 : Failed to start resource ora.cmpp01_site1.cmpp01.svc 
CRS-2680: Clean of 'ora.cmpp01_site1.cmpp01.svc' on 'av2l954p' failed 
CRS-5802: Unable to start the agent process 

It's not fine that the agent can not be started, so let's check the logfile crsd_oraagent_oracle.trc:
2018-07-23 21:29:14.339 : CRSCOMM:2939654272:  Ipc: Starting send thread
2018-07-23 21:29:14.340 :GIPCXCPT:2939654272:  gipcInternalConnectSync: failed sync request, ret gipcretConnectionRefused (29)
2018-07-23 21:29:14.340 :GIPCXCPT:2939654272:  gipcConnectSyncF [connectToServer : clsIpcClient.cpp : 380]: EXCEPTION[ ret gipcretConnectionRefused (29) ]  failed sync connect endp 0x1cbb5b0 [0000000000000090] { gipcEndpoint : localAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=)(GIPCID=00000000-00000000-0))', remoteAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=CRSD_IPC_SOCKET_11)(GIPCID=00000000-00000000-0))', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj 0x1cbe170, sendp 0x1cbdf20 status 13flags 0xa108871a, flags-2 0x0, usrFlags 0x30000 }, addr 0x1cbc990 [0000000000000097] { gipcAddress : name 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=CRSD_IPC_SOCKET_11)(GIPCID=00000000-00000000-0))', objFlags 0x0, addrFlags 0x4 }, flags 0x0
2018-07-23 21:29:14.340 : CRSCOMM:2939654272:  IpcC: gipcConnect() failed, rc= 29
2018-07-23 21:29:14.340 : CRSCOMM:2939654272: [FFAIL] IpcC: Could not connect to (ADDRESS=(PROTOCOL=IPC)(KEY=CRSD_IPC_SOCKET_11)) ret = 29

It clearly says something is going wrong; unfortunately it doesn't tell in detail what's going wrong or what's required.

Maybe the most beautiful but useless analysis here was provided by tfactl summary -crs:
DETAILS                            STATUS    COMPONENT  
+----------------------------------+---------+-----------+
  .------------------------------.   PROBLEM   CRS        
  | CRS_SERVER_STATUS   : ONLINE |                        
  | CRS_STATE           : ONLINE |                        
  | CRS_INTEGRITY_CHECK : FAIL   |                        
  | CRS_RESOURCE_STATUS : ONLINE |                        
  '------------------------------'                        
+----------------------------------+---------+-----------+

As neither My Oracle Support nor google helped a lot here, I started a random cry on twitter:


And Anil had mercy on me:
This was not the answer to my problem, but it started a very valuable conversation in which I learned enough to identify the issue.

It's important to know that (this) oraagent.bin process is started by crsd.bin, and it is restarted if it disappears for any reason.
So to see what's really going on and leads to the error above, run strace on crsd.bin and its forks/clones:

As expected, crsd.bin creates a new process for oraagent.bin:
18631 clone( <unfinished ...>
...
18631 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fe273dfe9d0) = 18633
...
18633 execve("/appl/grid/grid_12201/bin/oraagent.bin", ["/appl/grid/grid_12201/bin/oraage"...], [/* 52 vars */]) = 0

and if everything runs as expected, the socket handling looks like this:
18633 socket(AF_LOCAL, SOCK_STREAM, 0 
...
18633 <... socket resumed> )            = 69  <== the creation of the socket
...
18633 connect(69, {sa_family=AF_LOCAL, sun_path="/var/tmp/.oracle/sCRSD_IPC_SOCKET_11"}, 110 <unfinished ...>
...
18633 <... connect resumed> )           = 0   <== the successful connection of the socket's file descriptor 69 to /var/tmp/.oracle/sCRSD_IPC_SOCKET_11

But in my problematic case it's this:
22612 connect(69, {sa_family=AF_LOCAL, sun_path="/var/tmp/.oracle/sCRSD_IPC_SOCKET_11"}, 110  <unfinished ...>
...
22612  <... connect resumed> )           = -1 ENOENT (No such file or directory)

The first finding here:
The error
gipcInternalConnectSync: failed sync request, ret gipcretConnectionRefused (29)
in the logfile correlates with
<... connect resumed> )           = -1 ENOENT (No such file or directory)

With this specific information, a short check in /var/tmp/.oracle shows the file is indeed missing.
A crosscheck in /proc/<pid_of_crsd.bin>/fd/ shows
lrwx------. 1 root root 64 Jul 26 22:00 /proc/2144/fd/214 -> /var/tmp/.oracle/sCRSD_IPC_SOCKET_11_lock (deleted)

This all happens on grid binaries 12.2 with the latest patch set.


I have now shown:
1) if oraagent.bin disappears, crsd.bin creates a new oraagent.bin process
2) the new oraagent.bin process tries to connect to a socket at /var/tmp/.oracle/sCRSD_IPC_SOCKET_11 to communicate with crsd.bin
3) if 2) fails, gipcretConnectionRefused (29) is shown in the logfile.



Right now I know:
If /var/tmp/.oracle/sCRSD_IPC_SOCKET_11 is missing but oraagent.bin is running, I can operate the cluster resources; but whenever oraagent.bin disappears, cluster resources can not be managed anymore.

If this is the only problem (and no other sockets in /var/tmp/.oracle are missing), I can restart crsd, which will also restart oraagent.bin, and everything works as expected.
crsctl stop res ora.crsd -init
CRS-2673: Attempting to stop 'ora.crsd' on 'node1'
CRS-2677: Stop of 'ora.crsd' on 'node1' succeeded

crsctl start res ora.crsd -init  
CRS-2672: Attempting to start 'ora.crsd' on 'node1'
CRS-2676: Start of 'ora.crsd' on 'node1' succeeded


My next questions to answer are
1) why does oraagent.bin disappear at all?
2) why do files in /var/tmp/.oracle disappear?

Both are still not answered and under investigation.

2018-07-23

enabling Database Vault on e-business RAC database

Right now I'm in the process of setting up Database Vault for an E-Business Suite database. This is a 2 node RAC cluster.
The DB is 12.1.0.2 with the April 17 BP.

As the DB already exists, I followed How To Enable Database Vault in a 12c database? (Doc ID 2112167.1).
Everything looked smooth but unfortunately, at the configuration of DV,
exec dvsys.configure_dv('DVOWNER','DVMANAGER');
fails with
ERROR at line 1: 
ORA-47500: Database Vault cannot be configured. 
ORA-06512: at "DVSYS.CONFIGURE_DV", line 23 
ORA-06512: at "DVSYS.CONFIGURE_DV", line 72 
ORA-06512: at line 2 

ORA-47500 tells
47500, 00000, "Database Vault cannot be configured."
// *Cause: The Database Vault configuration failed because some Database Vault  
//         objects were missing.
// *Action: Run the dvremov.sql script to remove Database Vault and then
//          reinstall Database Vault.

So the first try was: remove DV and start again.
Nothing changed.


Some research on MOS led to Known Issues In A Database Vault Environment (Doc ID 2330775.1), where the explanation for ORA-47500 is:

Cause: There are many invalid objects in the database.

Solution: Run $ORACLE_HOME/rdbms/admin/utlrp.sql to validate the invalid objects
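
A simple sketch to get an overview of what is invalid (and to recheck after utlrp.sql):

select owner, object_type, object_name
  from dba_objects
 where status = 'INVALID'
 order by owner, object_type, object_name;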

So all invalid objects were removed/fixed and DV was enabled again.
Same error.


A lot more investigation and an SR were required here.

When catmac.sql was run, it completed and didn't show any hint that it had failed at the end.
Especially the DV registration and its validation were shown as successful:
SQL> DECLARE
  2      num number;
  3  BEGIN
  4      dbms_registry.loaded('DV');
  5      SYS.validate_dv;
  6  END;
  7  /

PL/SQL procedure successfully completed.


After a lot more investigation, an issue in the (manually activated) spool of catmac.sql was found:
ERROR at line 1:
ORA-31011: XML parsing failed
ORA-19213: error occurred in XML processing at lines 49
LPX-00217: invalid character 21 (U+0015)
ORA-06512: at "DVSYS.DBMS_MACADM", line 160
ORA-06512: at line 2
and this error stack leads to Installation Of Database Vault Fails With ORA-19213, LPX-00217: Invalid Character 21 (U+0015) (Doc ID 2324871.1)
As so often: the problem looks obvious, but only after it was found.


The problem is the environment variable
ORA_NLS10=<ORACLE_HOME>/nls/data/9idata
which is a requirement for Oracle E-Business Suite.

Here RAC has its fair share of complexity, as this environment variable is not (only) defined in a .bash_profile or similar, but also in the cluster registry:
srvctl getenv database -d <DB_UNIQUE_NAME>
EBSSIB01_SITE1:
ORA_NLS10=<ORACLE_HOME>/nls/data/9idata

The fix was to disable this setting:
srvctl unsetenv database -d <DB_UNIQUE_NAME> -t ORA_NLS10
restart all instances
and now successfully enable DV at last!

Of course, afterwards ORA_NLS10 must be set again:
srvctl setenv database -d <DB_UNIQUE_NAME> -t "ORA_NLS10=<ORACLE_HOME>/nls/data/9idata"
and the instances restarted once more.

2018-07-19

OUD - no operational route

Oracle Unified Directory is by far not my preferred tool.
Probably it's me; I'm just not used to it.

The latest joy was an error when I tried to run
eusm addDatabase
and I got
javax.naming.AuthenticationException: [LDAP: error code 49 - Invalid Credentials]

Very unspecific, so I logged into Oracle Unified Directory Services Manager (OUDSM)


There a click on Data Browser showed another error message:
LDAP: error code 52 - The load balancing workflow element 
has no operational route. 
This may happen when no route has been configured 
or all the configured routes are down.






Google was of no help here, but it was obvious something was "down":

As there was no change on OUD, I asked the admins of those proxy data sources - my friends the AD admins.
They confessed they had changed something: their side now demands strong authentication ("ldap_bind: Strong authentication required").

This leads to a quite straightforward solution:
enable LDAPS and import "their" keys into OUD's trust manager.

First import the keys:
I decided to go for the preferred keystore type: JKS.
The password for the keystore is located in <OUD>/config/keystore.pin,
and the keys are installed with
<java_home>/bin/keytool \
-importcert -alias priv-root -file /tmp/priv-root.cer \
-keystore config/truststore -storetype JKS






And of course the config must be adapted as well:

With these settings, the Data Sources are reachable again - everything is fine.
(I had to restart OUD so that all Data Sources were available - no clue why.)

If you ever hit such an issue, don't follow my approach blindly.
May it help at least a little bit.

2018-06-26

when it runs on your computer ...

... please don't consider it ready for customers.

I try to enable Database Vault on an existing 12.2 database.
A proper document to follow is How To Enable Database Vault in a 12c database ? (Doc ID 2112167.1)

At some point it tells you to run
@$ORACLE_HOME/rdbms/admin/catmac.sql system temp <system_password>

unfortunately this fails with
CREATE USER dvsys IDENTIFIED BY "D_DVSYSPW"
*
ERROR at line 1:
ORA-28003: password verification for the specified password failed
ORA-20025: Password must contain at least 1 digit(s)
(you need to spool the results into a file to find it in the output stream)

The reason is obvious:
SQL> select * from dba_profiles where resource_name='PASSWORD_VERIFY_FUNCTION' and Profile='DEFAULT';

PROFILE                        RESOURCE_NAME                    RESOURCE LIMIT                                    COM
------------------------------ -------------------------------- -------- ---------------------------------------- ---
DEFAULT                        PASSWORD_VERIFY_FUNCTION         PASSWORD ORA12C_VERIFY_FUNCTION                   NO

and so is the solution:
alter profile default limit PASSWORD_VERIFY_FUNCTION null;
@$ORACLE_HOME/rdbms/admin/catmac.sql system temp <system_password>
alter profile default limit PASSWORD_VERIFY_FUNCTION ORA12C_VERIFY_FUNCTION;


Oracle requires disabling one security feature to enable another one.
In this sandbox environment it's not a specific, hand-made password verify function; it's the default one from Oracle.
It would be very nice if Oracle tested its security features to run with each other.

There is no problem for me, but I'm wondering how seriously (holistically) Oracle takes "security".

2018-06-20

preupgrade.jar version number

Mike Dietrich is very motivated to convince everyone who wants to take Oracle database upgrades seriously to always use the latest version of preupgrade.jar.

Some parts of Note 884522.1 - How to Download and Run Oracle's Database Pre-Upgrade Utility are misleading from my point of view:
If the upgrade-to version is 12.2 or higher, then save the file to your target $ORACLE_HOME/rdbms/admin directory and then unzip the file. It could be your source or target $ORACLE_HOME.
but this will be sorted out for sure in the near future.

For me, deploying some files to an ORACLE_HOME (besides its own well-defined deployment process) sounds creepy, and I'm interested in how this will fit into the concept of read-only ORACLE_HOMEs in 18c+.

As I don't want to modify anything (not even replacing files with identical ones in case the latest preupgrade.jar is already deployed), it's better to check the version of the current preupgrade.jar first.

If it has been used already, the Build number is visible in preupgrade.log:
Report generated by Oracle Database Pre-Upgrade Information Tool Version
12.2.0.1.0 Build: 14

Upgrade-To version: 12.2.0.1.0

=======================================
Status of the database prior to upgrade
=======================================
...

But if it has not been executed (yet), you have to get the Build number from the jar file.
At the moment (Build 014) you can get it from preupgrade_package.sql with a simple statement:

unzip -p $PATH_TO_PREUPGRADE/preupgrade.jar preupgrade_package.sql | \
awk '$1 ~ /c_build/ { print $5 }' | tr -d \;

This can help to identify the current installation and whether a (re-)installation is really required.



Unfortunately there are some inconsistencies,
e.g. in preupgrade_181_cbuild_2_lf.zip:
if you unzip this file, you get a preupgrade_package.sql where c_build is 2,
but the preupgrade_package.sql inside preupgrade.jar shows c_build is 1.

2018-05-30

flipping performance

Recently I had a request to check "if there is any problem with the database at <specific times>".
You can imagine: there was no problem. Nothing in the alert.log, no tracefiles, no locks or oddities in ASH/AWR.

I had to ask back & forth to get some more information about the issue. The information I got was:
"we use a statement SELECT * FROM table(some_function('P1', 'P2')) - and it took longer than 10 sec at the given times". Of course no bind variables were used, but constants every time.
This situation helps a lot, as obviously there is nothing to do with SQL_IDs now, and the real issue is (probably) within the function.
The function just generates one SELECT (no binds again - but PL/SQL did the "auto-binding").
With this SQL it's easy to identify the SQL_ID.
This SQL_ID has 3 children with different plans. That is sufficient to check whether the specific times somehow match a flip of plans. This was done with a simple query:

with gash as (
select sql_id, sql_child_number, sample_time, LAG(sql_child_number, 1, 0) OVER (ORDER BY sample_time) AS prev_child
from gv$active_session_history
where sql_id='&sql_id'
order by sample_time
)
select *
from gash
where sql_child_number != prev_child
order by sample_time

For a longer observation dba_hist_active_sess_history can be used as well.

The result easily showed a flip between children/plans at the given times.

(solution was to generate "outline-hints" with dbms_xplan.display_cursor for the good plan and so hint the SQL inside of some_function).

This was no complex task to analyze nor a big deal to execute. Just a small example where GUIs might not help so much. With the (little) information given, it would have been pure luck to see the problematic pattern in an ASH graph. As there was nothing to filter on, all the other "noise" in the DB would have wiped the information out.

Sometimes it's good to know the architecture and views, not only the GUI.

PS: The statement above is ugly. A MATCH_RECOGNIZE would be more elegant. Unfortunately this DB is 11.2.
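
For 12c and later, an untested sketch of the same flip detection with MATCH_RECOGNIZE:

SELECT sql_id, flip_time, prev_child, new_child
FROM (SELECT sql_id, sql_child_number, sample_time
        FROM gv$active_session_history
       WHERE sql_id = '&sql_id')
MATCH_RECOGNIZE (
  PARTITION BY sql_id
  ORDER BY sample_time
  MEASURES a.sql_child_number AS prev_child,   -- child before the flip
           b.sql_child_number AS new_child,    -- child after the flip
           b.sample_time      AS flip_time
  ONE ROW PER MATCH
  AFTER MATCH SKIP TO LAST b
  PATTERN ( a b )
  DEFINE b AS b.sql_child_number != PREV(b.sql_child_number)
);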

2018-05-28

enabling Database Vault is complicated

Enabling Database Vault in an already running system can be laborious.

I recently tried to enable DV in a (copy of a) production Oracle EBS DB.
The documentation is pretty clear and easy:
https://docs.oracle.com/database/121/DVADM/getting_started.htm#DVADM002
But I failed at

BEGIN 
DVSYS.CONFIGURE_DV ( 
dvowner_uname => 'DBV_OWNER', 
dvacctmgr_uname => 'DBV_ACCTMGR'); 
END; 
/ 
with
ERROR at line 1: 
ORA-47500: Database Vault cannot be configured. 
ORA-06512: at "DVSYS.CONFIGURE_DV", line 23 
ORA-06512: at "DVSYS.CONFIGURE_DV", line 72 
ORA-06512: at line 2 
That's not very helpful, and ORA-47500 suggests:
If this happens, deinstall and then reinstall Oracle Database Vault
Even after removal and re-installation, the symptoms persist.

The problem here: some invalid objects in the database!

As it was unclear to me which objects are the problem, I asked Oracle.
Unfortunately the answer does not help a lot:

It is possible that not all [objects] need to be valid before you can configure DV successfully but you definitely need to work on fixing these invalid object at this point because that is the reason for the error.
If this happens, deinstall and then reinstall Oracle Database Vault.

Of course the suggestion doesn't help; utlrp did not change anything.
As the invalid objects don't belong to the DBA but to some deployed code, it will take some time until DV is enabled.

I don't have any solution right now other than laboriously wading through all invalid objects and fixing them.

2018-04-30

Tracing the Database Configuration Assistant

I am struggling with DBCA right now, so a little tracing is advised.
The note Tracing the Database Configuration Assistant (DBCA) (Doc ID 188134.1) gives a nice suggestion to add these parameters:
-DTRACING.ENABLED=true -DTRACING.LEVEL=2
but editing the shell script in $ORACLE_HOME/bin/dbca is no elegant solution.

I prefer my new swiss army knife for java parameters, a simple environment variable:
export _JAVA_OPTIONS='-DTRACING.ENABLED=true -DTRACING.LEVEL=3'
This is very useful for any java program. As Oracle has many such programs, there is often a use for _JAVA_OPTIONS.

2018-04-01

more fun with ages


On-call duties sometimes lead to interesting results.
I have to observe a running restore (or, to be more precise, wait for a tape library to find the right tapes and put them into drives).
This gives me time to catch up on my news aggregator. And some time to play with interesting news.
One interesting post I stumbled across is

Franck Pachot's After IoT, IoP makes its way to the database.
There he mentions a new, yet undocumented function to_dog_year().
As I had some fun with ages in general, this promises to be even more entertaining!

If I had a dog whose equivalent age is the same as mine, what would its birthday be?
It's simple now:
SQL> alter session set nls_date_format='YYYY-MM-DD HH24:MI:SS';

select BX_YEAR_CALC.ADDITION(sysdate,   
         -1 * to_dog_year(BX_YEAR_CALC.DIFFERENCE(:MY_BIRTHDAY))) 
                                                as dogs_birthdate 
from dual;

DOGS_BIRTHDATE     
-------------------
2012-09-26 03:27:18

(I don't care about breed or this other parameter as it's only a hypothetical dog in my case).


As the library was still shuffling tapes, I was guessing about the origin of this function.
Franck's idea of the Internet of Pets makes some sense to me, but such a very specific function?
I know stories about Oracle providing special versions of its database to important customers. But that was back in something like Version 5 or 6 (long before my time) and more dedicated to performance issues of any kind. But maybe this is the case here also.
IF this important customer is doing any business with a huge number of dogs and needs to calculate their "age" all over the application, it might make sense to ask Oracle to code it in the core of the database. Such custom functions are very uncommon to me, but maybe we will see the next big thing soon, and Oracle is trying to gain its market share there?



The restore is done, but recovery takes its time! I'm sure the archivelogs are spread across many tapes. This gave me time to do some (very simple!) performance tests with to_dog_year. It seems to be very easy to calculate the "age" of a puppy, but this gets more expensive rapidly. At an input of 2 (or somewhere around that - I increased the input by 0.5) the complexity seems to drop, and the effort increases only slightly afterwards.
It would be interesting to see the real formula, if it ever gets published, but at an input of 2 it seems to flip from a "calculation for young dogs" to a "calculation for adult dogs".
If it's really for something like tracking dogs, I hope they (whoever they are) have good capacity planning: I assume customers with young dogs will be more interested in joining the service (whatever it will be), which makes the computation requirements slightly higher than average.

Anyhow, the future will tell.
As the restore finished, I'll continue the night with better things to do...


Update:
If you are interested in more details about to_dog_year(), before you invest any effort there, please check the date and time when this post was published.
I hope you liked it as much as we did.

2018-03-31

spfile parameters derived from cpu_count

In Oracle databases some spfile parameters are derived from others if they are not set explicitly.
One of the main drivers I discovered is CPU_COUNT.
In the past I did similar tests for versions 11.1.0.7 and 12.1.0.2.
To simulate a machine with sufficient CPUs, I had to use a trick described in a previous post. Using _disable_cpu_check threw an ORA-7445:[ksucre] in 12.2 with cpu_count > 126.

The script I used this time is basically this (of course with 180 instead of 122 for 18.0):
#!/usr/bin/ksh  -x

integer i=1
while ((i <= 255));
do
  print " $i ";

  echo "*.audit_file_dest='/_some_dir_/BX1/adump'
*.audit_trail='db'
*.compatible='12.2.0.1.0'
*.control_files='/_some_dir_/BX1/controlfile/o1_mf_djd017z2_.ctl', '/_some_dir_/fast_recovery_area/BX1/BX1/controlfile/o1_mf_djd01834_.ctl'
*.db_block_size=8192
*.db_create_file_dest='/_some_dir_/oradata'
*.db_name='BX1'
*.db_recovery_file_dest='/_some_dir_/fast_recovery_area/BX1'
*.db_recovery_file_dest_size=7G
*.db_unique_name='BX1'
*.diagnostic_dest='/appl/oracle'
*.remote_login_passwordfile='EXCLUSIVE'
*.sga_target=7g
*.undo_tablespace='UNDOTBS1'
*.max_string_size='EXTENDED'
#*._disable_cpu_check='TRUE'
*.cpu_count=$i" > /_some_dir_/cpus/initBX1_$i.ora

  echo "shutdown abort;
startup pfile='/_some_dir_/cpus/initBX1_$i.ora' ;

insert into init_parameters
select '12.2'     ,
       $i         ,
       a.ksppinm  ,
       b.ksppstvl ,
       c.ksppstvl
from x\$ksppi a, x\$ksppcv b, x\$ksppsv c
where a.indx = b.indx and a.indx = c.indx;

commit;

select distinct(\"CPUs\") from init_parameters;

shutdown;
exit;"> cpus/122doit$i.sql

sqlplus "/ as sysdba" @cpus/122doit$i.sql >cpus/122log$i.log

sleep 5

  (( i = i + 1));
done
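
The script assumes a helper table init_parameters; a sketch that matches the INSERT above (all column names except "CPUs" are my assumption):

create table init_parameters (
  db_version varchar2(10),    -- e.g. '12.2'
  "CPUs"     number,          -- the cpu_count set for this startup
  name       varchar2(128),   -- x$ksppi.ksppinm
  cv_value   varchar2(4000),  -- x$ksppcv.ksppstvl (current session value)
  sv_value   varchar2(4000)   -- x$ksppsv.ksppstvl (current system value)
);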


These are the affected parameters; each 'x' marks one of the versions 11.1.0.7 / 12.1.0.2 / 12.2.0.1 / 18.0.0.0 in which the parameter is derived from cpu_count:

_cursor_db_buffers_pinned: x x x x
_db_block_buffers: x x x x
_db_block_lru_latches: x x x x
__db_cache_size: x x x x
_db_file_exec_read_count: x x
db_file_multiblock_read_count: x x
_db_handles: x x x
db_writer_processes: x x x x
dml_locks: x x x
_enqueue_hash: x x x
_enqueue_hash_chain_latches: x x x x
_enqueue_locks: x x x x
_enqueue_resources: x x x
_flashback_generation_buffer_size: x x x x
_gc_fast_index_split_wait: x
_gcs_min_cr_slaves: x x
_kghdsidx_count: x x
_hang_resolution_scope: x x
_ksmsq_hintmaxproc: x
__large_pool_size: x x x
_lms_rollbacks: x x
log_buffer: x x x x
_log_parallelism_max: x x x x
_log_simultaneous_copies: x x x x
_max_log_write_parallelism: x x
_max_outstanding_log_writes: x x x
_max_pending_scn_bcasts: x x x
_messages: x x x
_num_longop_child_latches: x x x x
_num_rlslaves: x x
parallel_max_servers: x x x x
_parallel_min_message_pool: x x x x
parallel_min_servers: x x x
parallel_servers_target: x x x
pga_aggregate_limit: x x x
_pmon_max_slaves: x x
_pmon_slaves_arr_size: x x
processes: x x x
_px_proactive_slave_alloc_threshold: x x x
_session_allocation_latches: x x x
sessions: x x x
sga_max_size: x x
shared_pool_reserved_size: x x x
__shared_pool_size: x x x x
_shared_server_num_queues: x x x x
_small_table_threshold: x x x x
_spin_count: x x x x
_sqlmon_max_plan: x x x
_super_shared_conversion_threshold: x x
transactions: x x x
_use_single_log_writer: x x x

To get a good picture of the changes, feel free to select parameters, versions or whatever you want below.
Raw data is available at GistHub.
The js app is available here.
(It's based on the idea of PerfSheet.js by Luca Canali.)

As some of the values are not numbers, I applied this mapping:
TRUE     →  1
FALSE    →  0
OFF      →  0
ADAPTIVE → -1
INSTANCE → -2


2018-03-26

Age - hard to calculate


The concept of age seems quite simple.
Right now I'm 42 years old.
I can be more precise and define it as 42 years, some months and several days.
And at a given date I will be 43 years old, and months and date are reset to 0.
Unfortunately it's hard to calculate with these numbers.
For those who want to go into detail I'd recommend ISO 8601 Data elements and interchange formats. Unfortunately I don't have access to this document.
Another good resource is the explanation for CPAN's DateTime.pm module.
I tried a slightly different approach:
Define the age as an integer and a fractional part, where the integer part is the age in years as we are used to it, and the fractional part is the number of days already passed divided by the total days in the current year.
This sounds quite simple, but it has some complicated effects due to leap years. So I wrote a little package (see below) to handle these effects.
The Package is called BX_YEAR_CALC and has 2 functions:
DIFFERENCE (year1 date, year2 date) returns the AGE.
ADDITION (year1 date, diff number) returns the date at a specific AGE.

Here are some examples:

SQL> alter session set nls_date_format='YYYY-MM-DD HH24:MI:SS';

SQL> select bx_year_calc.difference('2004-02-29 12:00:00', '2005-03-01 11:59:00') diff from dual;

      DIFF
----------
1.00136796

SQL> select bx_year_calc.addition ('2004-02-29 12:00:00',  1.00136796042617960426179604261796042618) d from dual;

D
-------------------
2005-03-01 11:59:00

SQL> select bx_year_calc.addition ('2005-03-01 11:59:00', -1.00136796042617960426179604261796042618) d from dual;

D
-------------------
2004-03-01 00:00:00


You can see in these examples that there are situations - especially around leap years and Feb 29th - where the calculation goes quite wrong. My implementation tries to reduce this effect to a minimum.

Feel free to play with dates and ages, any comments are very welcome.


create or replace PACKAGE BX_YEAR_CALC AS 

/*  2018-03-25 - berx - initial - martin.a.berger@gmail.com
    BX_YEAR_CALC - to calculate DATES in units or YEARS
                   equal idea as  DATE1 - DATE2 returns a number,
                   where the integer part are full days and fractional part  reflects remaining hours, minutes and seconds
                   
                   As some years have 365 and others 366 days, the last year can have a different number of total days,
                   so the fraction must be calculated based on this.
                   Only the number of days of the last year (based on the begin) is calculated.
                   
                   Be aware: there are situations with leap years or the gregorian calendar where results are not intuitive.
                   If you find a better implementation for a given situation, let's discuss

*/


/* returns the "age in years" between the 2 parameters. 
   integer is quite obvious - it's the years between the start date and the same day (if it exists) in target year.
   fraction is remaining days divided by days in the last year
*/
  FUNCTION DIFFERENCE(YEAR1 date default sysdate,
                      YEAR2 date default sysdate)
    RETURN number;

/* adds (or subtracts - if the number is negative) an "age in years" as defined by DIFFERENCE to a given date.
   for addition, first the integer part is added and then the fraction - based on days in the last year.
   for subtraction, first the fraction is removed (to be more aligned with addition) and then the integer.
   */
  FUNCTION ADDITION (YEAR1 date default sysdate,
                DIFF number default 0)
    RETURN date;

/* returns the next "existing" date to a given "old-date" and a specific other year ("new year")
   this handles the situation similar to "today is 2004-02-29. what's the date a year ago?
   to avoid ORA-01840: input value not long enough for date format
   this function searches the next day "above" (with incr => default 1) or "below" (with incr => -1)
   other values for incr are not defined.
   */
    FUNCTION next_real_date (
        new_year VARCHAR2,
        old_date DATE,
        incr number default 1
    )
    return date;

END BX_YEAR_CALC;
/

CREATE OR REPLACE PACKAGE BODY bx_year_calc AS

    c_debug          NUMBER := 0;

    PROCEDURE debug (
        message VARCHAR2
    )
        AS
    BEGIN
        IF
            c_debug > 0
        THEN
            dbms_output.put_line(message);
        END IF;
    END debug;

    FUNCTION next_real_date (
        new_year   VARCHAR2,
        old_date   DATE,
        incr       NUMBER DEFAULT 1
    ) RETURN DATE AS

        year2_year    NUMBER;
        return_date   DATE; -- the highest full year related to YEAR1  which is smaller than YEAR2  
        year_diff     NUMBER;
        leap_help     BOOLEAN := false; -- do we hit a leap-year issue and need to sort out things?
        leap_count    NUMBER := 0; -- how many days to add to fix leap issue
        date_not_valid EXCEPTION;
        PRAGMA exception_init ( date_not_valid,-1839 );
    BEGIN
        IF NOT abs(incr) = 1 THEN
            raise_application_error(-20001,' incr must be +1 or -1 ');
        END IF;
        WHILE NOT leap_help LOOP
            BEGIN
                IF
                    leap_count = 0
                THEN
                    return_date := TO_DATE(new_year || '-' || TO_CHAR(old_date,'MM-DD HH24:MI:SS'),'YYYY-MM-DD HH24:MI:SS');

                    leap_help := true;
                ELSE
                    return_date := TO_DATE(new_year || '-' || TO_CHAR(old_date + leap_count,'MM-DD'),'YYYY-MM-DD');

                    leap_help := true; -- this happens only if NO exception is thrown
                END IF;

            EXCEPTION
                WHEN date_not_valid -- ORA-01839: date not valid for month specified
                 THEN
                    leap_count := leap_count + incr;

                    IF -- just a safety in case anything goes wrong 
                        abs(leap_count) > 33
                    THEN
                        leap_count := 1 / 0;
                    END IF;
            END;
        END LOOP;

        RETURN return_date;
    END next_real_date;

    FUNCTION difference (
        year1   DATE,
        year2   DATE DEFAULT SYSDATE
    ) RETURN NUMBER AS

        days_lastyear    NUMBER; -- how many days the last year has
        fract_lastyear   NUMBER; -- remaining part as fraction of a year
        year2_year       NUMBER;
        hi_year_lo       DATE;   -- the highest full year related to YEAR1  which is smaller than YEAR2  
        hi_year_hi       DATE;   -- the smallest full year related to YEAR1  which is higher than YEAR2  
        year_diff        NUMBER;
        leap_help        BOOLEAN := false; -- do we hit a leap-year issue and need to sort out things?
        leap_count       NUMBER := 0;      -- how many days to add to fix leap issue
        date_not_valid EXCEPTION;
        PRAGMA exception_init ( date_not_valid,-1839 );
    BEGIN
        IF
            year1 > year2
        THEN
            RETURN difference(year2,year1);
        ELSE
            year2_year := to_number(TO_CHAR(year2,'YYYY') );
            hi_year_lo := next_real_date(TO_CHAR(year2_year,'0000'),year1);
            hi_year_hi := next_real_date(TO_CHAR(year2_year + 1,'0000'),year1);
      
      -- how many days are in the last year to take care of
            days_lastyear := hi_year_hi - hi_year_lo;
            fract_lastyear := ( year2 - hi_year_lo ) / days_lastyear;

            year_diff := to_number(TO_CHAR(hi_year_lo,'YYYY') ) - to_number(TO_CHAR(year1,'YYYY') );

            RETURN year_diff + fract_lastyear;
        END IF;
    END difference;

    FUNCTION addition (
        year1   DATE DEFAULT SYSDATE,
        diff    NUMBER
    ) RETURN DATE AS

        year_year        NUMBER;
        year_diff        NUMBER;
        days_lastyear    NUMBER; -- how many days the last year has
        hi_year_lo       DATE;   -- the highest full year related to YEAR1  which is smaller than YEAR2  
        hi_year_hi       DATE;   -- the smallest full year related to YEAR1  which is higher than YEAR2  
        lo_year_lo       DATE;   -- a temp "lowest date" - only to keep the calculation somehow readable
        days_jump_back   NUMBER := 15; -- a number bigger than "365 + (15-4)" - google gregorian calendar reform 1582 & 4 October 1582 
        days_offset      NUMBER;
        leap_help        BOOLEAN := false; -- do we hit a leap-year issue and need to sort out things?
        leap_count       NUMBER := 0;      -- how many days to add to fix leap issue
        date_not_valid EXCEPTION;
        PRAGMA exception_init ( date_not_valid,-1839 );
    BEGIN
        IF
            diff < 0
        THEN
          -- first let's substract only the fraction of diff
            year_year := to_number(TO_CHAR(year1,'YYYY') );
            hi_year_lo := next_real_date(TO_CHAR(year_year - 1,'0000'),year1);
            days_lastyear := year1 - hi_year_lo;
            hi_year_hi := hi_year_lo + ( days_lastyear * ( diff - trunc(diff) ) );
          -- now the easy part - years
            RETURN next_real_date(TO_CHAR(to_number(TO_CHAR(hi_year_hi,'YYYY') ) + trunc(diff) + 1,'0000'),hi_year_hi);
          -- trunc(diff)+1 is required, as we substracted "-1" in the calculation of "hi_year_lo" above already

        ELSIF diff = 0 THEN
            RETURN year1;
        ELSE
            year_year := to_number(TO_CHAR(year1,'YYYY') ) + trunc(diff);
            hi_year_lo := next_real_date(TO_CHAR(year_year,'0000'),year1);
            hi_year_hi := next_real_date(TO_CHAR(year_year + 1,'0000'),year1);
-- 
            days_lastyear := hi_year_hi - hi_year_lo;
            RETURN hi_year_lo + ( days_lastyear * ( diff - trunc(diff) ) );
        END IF;
    END addition;

END bx_year_calc;
/

2018-03-20

real virtual CPUs

Some software changes its behavior based on the capabilities of the system it's running on,
but sometimes it's interesting to check how a software would behave on a different system which is not at hand right now.

On Linux, a lot of information about the current system can be found in /proc and /sys.
These filesystems are virtual, so they can not be changed easily with an editor.

In my case I want to simulate a lot more CPUs.
These are visible in several locations.
The best known is probably /proc/cpuinfo. There you find a block of information for each CPU the kernel knows about. Based on the current configuration, I create a modified fake file somewhere in a different place:
#!/bin/bash
# cpus.sh

count=0
max_socket=8
max_core=32

END=5
for ((soc=0;soc<max_socket;soc++)); do
    for (( cor=0;cor<max_core;cor++)); do
echo "processor       : $count
vendor_id       : GenuineIntel
cpu family      : 6
model           : 37
model name      : Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
stepping        : 1
microcode       : 0x3b
cpu MHz         : 2596.103
cache size      : 25600 KB
physical id     : $soc
siblings        : $max_core
core id         : $cor
cpu cores       : $max_core
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes hypervisor lahf_lm ida arat epb pln pts dtherm pti tsc_adjust
bugs            : cpu_meltdown spectre_v2
bogomips        : 5193.98
clflush size    : 64
cache_alignment : 64
address sizes   : 42 bits physical, 48 bits virtual
power management:

"
    let count=count+1
    done
done
and create the file with ./cpus.sh > cpuinfo.256

There is another location as well: /sys/devices/system/cpu.
In this directory are several directories and files with interesting information.

I copy the full directory to another place (ignoring all the errors).
First, the number of cpu[id] directories might need adjustment.
In my case a simple set of symlinks is sufficient:
for i in {2..255} ; do
  echo $i
  ln -s cpu1 cpu${i}
done  
In every cpu[id] directory there is a symlink to the node it belongs to: node0 -> ../../node/node0
So it might be required to spoof proper entries in /sys/devices/system/node. In my case it's not required.

The last fix required in my case is in the file cpu/online.
It contains 0-255 now (instead of 0-2).

As I mentioned above, the original files can not be manipulated as they are not real files.
The mount option --bind does the trick:
mount --bind <my_working_dir>/cpuinfo.256 /proc/cpuinfo
mount --bind <my_working_dir>/cpu /sys/devices/system/cpu

After these nice manipulations, my sandbox Oracle instance shows now plenty of CPUs:
SQL> show parameter cpu_count                                                                                                                                                      

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cpu_count                            integer     256

Update (2018-03-21):
For Oracle databases I got 2 hints on how to make the instance calculate with more CPUs than are really available.

One is this small stap script:
#!/usr/bin/stap
function modify_rax() %{ long ret; ret = 6; memcpy( ((char *)CONTEXT->uregs) + 80 , &ret, sizeof(ret)); %}
probe process("oracle").function("skgpnumcpu").return { modify_rax(); }

and




2018-03-11

Method R Profiler on x86_64 Linux with HiDPI


For a newly installed Linux laptop I tried to install Method R Profiler (and Tools).
But it was not as smooth as expected.

mrprop_wrapper failed with

/opt/mrprof/mrprof_wrapper: line 45: /opt/mrprof-7.1.1.3-linux32/bin/mrprof.exe: No such file or directory
But the file is there, and it's executable.
So the error is slightly misleading.
Also strace showed a similar error:
execve("/opt/mrprof-7.1.1.3-linux32/bin/mrprof.exe", ["/opt/mrprof-7.1.1.3-linux32/bin/"...], [/* 62 vars */]) = -1 ENOENT (No such file or directory)
but execve(2) is slightly more clear:
ENOENT The file filename or a script or ELF interpreter does not exist, or a shared library needed for the file or interpreter cannot be found

As I'm very sure the file exists, let's check the other possibility:
$ file mrprof.exe 
mrprof.exe: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, dynamically linked, interpreter \004, stripped
The file is a 32-bit ELF executable.

$ uname -m
x86_64
The system is 64-bit


$ readelf -l -d  mrprof.exe 

Elf file type is EXEC (Executable file)
Entry point 0x804a878
There are 8 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x08048034 0x08048034 0x00100 0x00100 R E 0x4
  INTERP         0x000134 0x08048134 0x08048134 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD           0x000000 0x08048000 0x08048000 0x10a60 0x10a60 R E 0x1000
  LOAD           0x011000 0x08059000 0x08059000 0x014e8 0x015c8 RW  0x1000
  DYNAMIC        0x011014 0x08059014 0x08059014 0x000f8 0x000f8 RW  0x4
  NOTE           0x000148 0x08048148 0x08048148 0x00020 0x00020 R   0x4
  GNU_EH_FRAME   0x010a10 0x08058a10 0x08058a10 0x00014 0x00014 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4

Dynamic section at offset 0x11014 contains 26 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [libnsl.so.1]
 0x00000001 (NEEDED)                     Shared library: [libdl.so.2]
 0x00000001 (NEEDED)                     Shared library: [libm.so.6]
 0x00000001 (NEEDED)                     Shared library: [libcrypt.so.1]
 0x00000001 (NEEDED)                     Shared library: [libutil.so.1]
 0x00000001 (NEEDED)                     Shared library: [libpthread.so.0]
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 0x0000000c (INIT)                       0x804a3b0
 0x0000000d (FINI)                       0x8057070
 0x00000004 (HASH)                       0x8048168
 0x00000005 (STRTAB)                     0x8049460
 0x00000006 (SYMTAB)                     0x80487b0
 0x0000000a (STRSZ)                      2697 (bytes)
 0x0000000b (SYMENT)                     16 (bytes)
 0x00000015 (DEBUG)                      0x0
 0x00000003 (PLTGOT)                     0x8059110
 0x00000002 (PLTRELSZ)                   592 (bytes)
 0x00000014 (PLTREL)                     REL
 0x00000017 (JMPREL)                     0x804a160
 0x00000011 (REL)                        0x804a140
 0x00000012 (RELSZ)                      32 (bytes)
 0x00000013 (RELENT)                     8 (bytes)
 0x6ffffffe (VERNEED)                    0x804a080
 0x6fffffff (VERNEEDNUM)                 3
 0x6ffffff0 (VERSYM)                     0x8049eea
 0x00000000 (NULL)                       0x0

The required program interpreter is missing:
# ls  /lib/ld-linux.so.2
ls: cannot access '/lib/ld-linux.so.2': No such file or directory

Unfortunately, the documentation is slightly unspecific regarding the requirements: glibc 2.14 is met, but on my system only as 64 bit by default.
With a glimpse at this part of the installation path, mrprof-7.1.1.3-linux32, it should have been obvious, but it wasn't for me, and the error message was not helpful at first.

The solution is quite simple. As root:
dpkg --add-architecture i386
apt-get update
apt-get install libc6:i386 



With this issue solved, the Profiler started, but with incredibly tiny fonts. This is a known effect on HiDPI screens. As it's a java program, one additional line in my profile solves the problem:
export _JAVA_OPTIONS='-Dsun.java2d.uiScale=2'
The start screen still looks like a stamp, but the fonts and windows are fine.



At last, I want to start the profiler from any location, not only its home directory. So Profiler.sh is extended to
script="$0"
WORK_DIR="$(dirname $script)"
java -jar "${WORK_DIR}/jlib/com.methodr.profiler.gui.jar" "$@"
(the WORK_DIR related stuff is new.)

Altogether nothing special, but it can cost some time when you want to use a tool for the first time - probably for urgent reasons - and first have to fix such obstacles.
So it's collected here, to save me some time when I hit it again.