Today I was looking at the Top SQL on an Oracle 11gR2 production database, and identified a statement taking 0.5s per execution. It was called frequently, so if I could tune it to something more reasonable, it would have an immediate impact on our users’ screen response time.
I began by taking a quick look at the statement using OEM. The statistics and activity tabs looked normal, so I moved on to browse the execution plan. There was a single line with just “SELECT” and no child steps! Hmm, probably a bug in the GUI, so let’s do things properly with DBMS_XPLAN. Running that from TOAD, I cancelled it after 80,000+ rows had been produced! Something is definitely odd about this statement.
So I checked V$SQL, and found 2,300+ rows for the single SQL_ID. Something I had not seen before is that there were only 100 child cursors (CHILD_NUMBER in the range 0..99). The CHILD_ADDRESS column proved to be unique, and there was an even distribution of 23 or 24 rows per CHILD_NUMBER. I read about a similar situation here. In summary, 11gR2 has a limit of 100 children per statement, but also keeps invalidated rows in V$SQL with OBJECT_STATUS = ‘INVALID_UNAUTH’. However DBMS_XPLAN doesn’t cope well with this situation.
The next step was to identify why the cursors were being invalidated after (it seemed) every execution. I looked in V$SQL_SHARED_CURSOR and could see that every row for the statement had a ‘Y’ in the USE_FEEDBACK_STATS column, which suggests that Cardinality Feedback was being used. This is only supposed to happen a few times and then settle down to a stable position, as described here and in other articles. For what it’s worth, the REASON column had this for every row:
<ChildNode> <ChildNumber>19</ChildNumber> <ID>3</ID> <reason>Optimizer mismatch(13)</reason> <size>3x4</size> <kxscflg>32</kxscflg> <kxscfl4>4194560</kxscfl4> <dnum_kksfcxe>262163</dnum_kksfcxe> </ChildNode>
From what I understand “Optimizer mismatch(13)” simply correlates with the USE_FEEDBACK_STATS = ‘Y’, and you would expect an ‘N’ in that column with a NULL reason for the next child cursor (as demonstrated here).
I did however stumble on a solution, which so far seems to be successful. For the statement in question, I simply ran the SQL Tuning Advisor and accepted its recommendation for a SQL Profile (suggesting a 99.99% benefit!). There is now one stable row in V$SQL with OBJECT_STATUS = ‘VALID’, it is using the SQL Profile and has a SQL Plan Baseline.
I found two other statements in the database exhibiting the same problem. The first of these was solved in the same way.
However the second was already performing well (<0.01s per execution) so the SQL Tuning Advisor did not provide a recommendation. Interestingly, the plans in V$SQL_PLAN for this statement had a value in the OTHER_XML column. It’s as though the Cardinality Feedback process generated the necessary hints, but for these few statements they were not picked up on subsequent executions. So I used the process described here (specifically the create_sql_profile.sql script) to manually create a SQL Profile from these hints. With that in place, the statement became stable like the others.
It will take a while for the obsolete rows to age out of V$SQL (I’m not going to flush the shared pool on production if I don’t have to!), but the counts are coming down and performance looks good.
Platform Information
The material in this article was most recently used with Oracle 11.2.0.3.0 on 64-bit Linux. Initialization parameters optimizer_capture_sql_plan_baselines and optimizer_use_sql_plan_baselines were both set to TRUE.