Optimizing Oracle Performance
Optimizing Oracle Performance, by Cary Millsap with Jeff Holt, defines the method that Hotsos-trained professionals use to optimize Oracle performance faster and more permanently than they had ever imagined.
You can read a sample chapter or order the book directly from the publisher.
Here is what Oracle Corporation has said about the project:

Structure of the Book
The book is divided into four parts, with twelve chapters and five appendixes in total.
Part I, Method, is about targeting. It is written in an informal, narrative style that sponsors and managers of performance improvement projects can read from front to back without getting distracted by a lot of technical details. It includes the following chapters:
Chapter 1, A Better Way to Optimize, explains why Oracle performance improvement is so difficult using conventional methods. It explains three important advances from other industries that Oracle performance analysts have ignored for decades. Finally, it describes the new performance improvement method to which the remainder of the book is dedicated.
Chapter 2, Targeting the Right User Actions, describes why many performance improvement projects are doomed from the beginning by poor project specifications. It explains how to construct a foolproof specification for your performance improvement project.
-
Chapter 3, Targeting the Right Diagnostic Data, describes how errors in diagnostic data collection are the root cause of many failed performance improvement projects. It describes why many projects cannot ever succeed without properly scoped diagnostic data, and it introduces three distinct sources of such information on Oracle systems.
Chapter 4, Targeting the Right Improvement Activity, explains how performance improvement projects can be held to the same standard of informed consent that other scientific endeavors require. It describes how to forecast performance improvement project costs and benefits and how to find the economically optimal performance improvement activity from the universe of things you could do about system performance.
Part II, Reference, is about detail. It is written in a deeply technical style in which I try to provide what a performance analyst needs to implement the method. It contains the following chapters:
Chapter 5, Interpreting Extended SQL Trace Data, describes the content of an Oracle extended SQL trace file. It describes the meaning of fields in a trace file, and it explains the relationships of time statistics throughout a trace file.
Chapter 6, Collecting Extended SQL Trace Data, explains how to collect the properly scoped extended SQL trace data that you'll need to analyze a performance problem.
Chapter 7, Oracle Kernel Timings, explains how software like the Oracle kernel measures itself and how you can verify the self-diagnostic behavior of your own system. It goes on to explain several sources of unaccounted-for time in Oracle trace files and why these lapses in timing data often contain performance diagnostic data in and of themselves.
Chapter 8, Oracle Fixed View Data, explains some of the many deficiencies of Oracle's dynamic performance views. It presents descriptions of several popular V$ fixed views and examples of their use. You might be surprised to find out that some of the things you thought you knew about Oracle's dynamic performance views are untrue.
Chapter 9, Queueing Theory for the Oracle Practitioner, is one of my favorites. It explains the physical phenomenon of queueing and how to use the body of mathematical knowledge called queueing theory to understand and even predict the performance of systems including Oracle database applications.
Part III, Deployment, returns to the informal, narrative style that I hope will encourage project sponsors and managers to follow along. It covers the issues of how to complete the job for maximal positive impact.
Chapter 10, Working the Resource Profile, describes a step-by-step method for analyzing Oracle response time data that leads to maximized performance improvement at minimized cost. It describes the tremendous economic benefits of waste removal and explains how to think “outside the box” to achieve performance improvements you might never otherwise have considered. Finally, it explains how to tell when your performance optimization work is complete, a task that is astonishingly difficult in conventional performance improvement methods.
Chapter 11, Responding to the Diagnosis, describes how to improve application performance in response to various patterns you'll find in your performance diagnostic data. It places particular emphasis upon how to eliminate wasteful work from your system, and it covers important response time components that are documented either poorly or not at all in other works.
Chapter 12, Case Studies, is the capstone chapter of the book. It documents four complete cases from problem identification, through the targeting, analysis, and deployment processes, to show you exactly how the method works in real life.
Errata
- Page xv, first paragraph:
The URL to Jared Still's site is missing a 'c'. It should read http://www.cybcon.com/~jkstill/util/util_master.html.
- Page xxv
Missing names include: Brad and Vonda McFarling, George Polisner, Zach Friese.
- Page 11, Example 1-2:
There should be no blank line after the SQL*Net message from client... line.
- Page 32, Reliability item:
Method R performs reliably in virtually every performance problem situation imaginable; ....
- Page 79, lid items:
Tests in Oracle version 9 show that regardless of whether the execution was done via definer or invoker rights, the uid and lid always seem to show up in the trace data as the same value: the id of the user who was logged in and running the procedure.
- Page 83, value field definition, append the following paragraph after the existing paragraph:
When no value field is emitted to the trace file, it is an indication that the NULL value has been bound into the placeholder variable. For example, in the fnd_profile.get_specific sample shown earlier, the absence of a value field for bind 4 indicates that the application has provided the NULL value for the :val placeholder. The bind value length specification of zero (avl=00) is corroborative evidence.
- Page 96, Example 5-7, example input file content:
Line break positions should be...
# WAIT #2: ...
# WAIT #2: ...
# FETCH #2:...- Page 113, last paragraph:
The second occurrence of "TIMED_STATISTICS" is misspelled.
- Page 117, final code segment before the bear trap icon:
select sid, serial#
from v$session
...- Page 118, paragraph beside the bear trap icon:
Do not use DBMS_SYSTEM.SET_SQL_TRACE_IN_SESSION to activate extended SQL trace, because this procedure can activate SQL tracing only at level 1. You cannot activate extended SQL tracing with SET_SQL_TRACE_IN_SESSION.
- Page 120, Table 6-2:
The correct operating system for file MERKUR_S7_FG_ORACLE_013.trc is OpenVMS 7.2-1.
- Page 133, exercise 1:
The command to enable trace is listed as alter system set events... when it should be alter session set events.... The intention is to create a trace file for a single command in a single session, not turn on trace for all new sessions on the instance.
- Page 134, Example 6-10, seventh line from the bottom:
$frac = sprintf "%06.0f", $frac/(10**($1-6));
- Page 136, second paragraph from the bottom:
Notice the highlighted portion of the final tim field value: the trace file contains information about what happened up to time ...23.690992 (expressed in seconds), and in fact 4 μs afterward, ...
- Page 137, third paragraph:
In the Hotsos Sparky data collector, we execute a query that is similar to this one immediately prior to executing the commands to activate and deactivate tracing.
- Page 149, next-to-final sentence before Example 7-3:
Strike the term user-mode: The difference between the two marks (c0 and c1) is the approximate amount of CPU capacity that was consumed by the database call.
- Page 157, first paragraph:
It's the Heisenberg uncertainty principle, not principal.
- Page 157, Table 7-1 and the inequality above it:
Centisecond times should be rendered only to five significant digits, not seven. For example, 2.0000 cs ≤ ea ≤ 2.9999 cs. This is because gettimeofday returns information that is precise only to 1 μs, which is 0.0001 cs.
- Pages 168–169:
Change all (seven) occurrences of the string “265” to “266”, and add a zero in the final number of the sidebar at the top of page 169: 100,000,000,000,000,000,000,000,000.
- Page 183, “Susceptibility to Overflow and Other Errors,” second sentence should read:
The problem is that an n-bit counter variable can store only 2n distinct values.
- Page 203, Example 8-3 $Header line:
Program name should be vprof.pl, not vproP.pl.
- Page 212, First paragraph in "The denominator problem":
"This SQL*Plus program is an attempt to produce a true resource profile...".
- Page 215, first paragraph:
A big part of the problem is a principle that....
- Page 231, Figure 9-6:
The label “Previous arrival” should say Next arrival, as shown here:

- Page 238, Figure 9-8 and caption:
Because the Poisson distribution is a discrete function (defined only for x = 0, 1, 2, ...), it is not correct to say that the distribution has a probability distribution function. Rather, a discrete distribution has a probability mass function. Instead of cluttering the text with a new definition for pmf, the drawing and caption can be corrected as follows:

Figure 9-8. The probability density function (pdf) for the function that generates the Poisson distribution with λ = 2 shows the probability P(A = x) that there will be exactly x arrivals in a one-second observation interval
- Page 268, "M/M/m/∞/FCFS (exponential service time)" definition:
Similarly, you must ensure that....
- Page 288, “Why targeting is vital,” first paragraph:
...(the one corresponding to the resource profile in Example 10-2, ...).
- Page 290, third paragraph:
...we were actually taught principles that encourage waste.
- Page 309, "Large c value for a PARSE call" definition:
If you have many small c values distributed across ....
- Page 310, third line from the bottom:
...or swapping, then patch your Oracle kernel so that your user action's code...
- Page 339, Fourth paragraph, second line:
...we started looking through our Hotsos Profiler output.
- Page 343, First paragraph:
Using bind variables and making the code look like the scalable application code shown in Example 11-2 (p322) will result in....
- Page 358, “Methodology” entry:
The word “been” is missing. Furthermore, the quoted matter should be offset from the surrounding text in a manner similar to that shown in the Oracle connection entry on page 359:
In recent years, however, the word “methodology” has been used as a pretentious substitute for “method” in scientific and technical contexts.... The misuse of “methodology” obscures an important conceptual distinction between the tools of scientific investigation (properly “methods”) and the principles that determine how such tools are deployed and interpreted—a distinction that the scientific and scholarly communities, if not the wider public, should be expected to maintain.
- The following index entries on pages 381–388 are missing or incorrect:
buffer busy wait events, 54, 317
Central limit theorem, 169
DBMS_PROFILER, 309
Greek alphabet, motive for use, 228
Hotsos Profiler, xv, 61
Kyte, Tom, 55, 183, 194–196
Laplace, Pierre Simon de, 169
ora_check_sql, 322, 366
prof-cid, 95–96
SET_SQL_TRACE_IN_SESSION procedure, 118
_SPIN_COUNT, 320–321
STAT, 83–84
Still, Jared, xv
tim.pl, 134
tkprof, xv, 61, 320
Trace Analyzer, 61
trcsummary, 61
XCTEND, 84
