It is always recommended we have a good table statistics to get a good explain plan which in turn provides us better execution
But do you think almost perfect (+ or – 5% change) stats always gives good plan ?
You will be surprised to see the difference it makes
I have a below table with more or less good stats
Lets check the count on the table :
SQL> select count(*) from NMANOJ.TEST_TABLE;
COUNT(*)
————
178183
SQL> select num_rows,last_analyzed from dba_tables where table_name=’TEST_TABLE’;
NUM_ROWS LAST_ANALAYZED
—————— ——————-
173052 05-NOV-19
We see a difference of 5000 rows difference between the actual rows and the statistical count
We use autotrace traceonly explain , which doesn’t actually runs the command instead it predicts the plan only with the help of existing stats
Here you can see the oracle is predicting the number of rows to be returned would be 2834
How do we land up to this number without executing the query just depending on the statistics
Oracle has a concept called NDV (Number of Distinct Values) , In this scenario we are executing the query with condition CALC_RESULT=1
The Number of rows on the table is 173052 and oracle thinks the number of distinct values on column CLAC_RESULT to be 61
The expected number of rows to be returned by the given condition =NUmberof Rows/NDV
No of rows=173052/61 which is close to 2836
Lets check the actual value for the given condition
Now we have almost perfect Stats but still we still we see a huge difference in rows predicted and rows retrieved , The reason being is we don’t have any idea how the data is distributed among the distinct values , Which value is repeated more and which value is not repeated much
Lets gather stats again
We will see the below Magic by collecting Stats
Lets try explain plan again
This is almost close to the actual rows much better compared to what we had earlier 2834
Note that there is no change in plan ,except you get accurate rows fetched by the statement
What changed before and after Stats 🙂
We can see the Histogram is created on the Column CALC_RESULT automatically by Oracle while stats was gathered because the NDV is less compared to Number of rows
Histogram helps to understand how the data is distributed on a column where predicate(where condition) is used very frequently
Practical Example :
Running a sales report on a table with million rows with where condition on columns like CITY or Pincode
Lets see more details on upcoming blogs how we get almost exact value 🙂
Perfectly explained 🙂
LikeLike
Hi Nikhil,
Thank you for your kind words 🙂
LikeLike