Fishing Around

mpruet's picture



Grouper Threads


No - this is not about some fish. Rather, this is about the grouper
threads within IDS, which regroup the logical log records into a
transaction for replication, evaluate the rows to determine what should
be replicated and where it should be replicated to, and then place the
replicated transaction into the send queue for transmission to the
target servers.


The grouper is composed of two parts. The first part is the grouper
fanout thread (CDRGfan). The purpose of the grouper fanout thread is
to:

  • Receive reconstituted log records from the log snooper (ddr_snoopy)
  • Regroup the transaction (i.e. attach the log record to the
    appropriate transaction)
  • Pass the log records to the grouper evaluator for evaluation
  • Determine if the transaction is consuming too many resources and
    needs special treatment such as its own memory pool and/or needs to be
  • Place the transaction into the grouper serial list. This is done
    when the commit record is processed and is used to ensure that the
    transaction is placed into the send queue in commit order.
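The regrouping and the serial list can be pictured with a small Python sketch. This is only an illustration of the idea, not the server's code: the `Fanout` class, the record layout, and the field names `tx_id` and `op` are all assumptions made for the example.

```python
class Fanout:
    """Hypothetical stand-in for the regrouping done by the fanout thread."""

    def __init__(self):
        self.open_txns = {}    # tx_id -> log records seen so far for that transaction
        self.serial_list = []  # tx_ids in the order their commit records were seen

    def receive(self, record):
        # Attach the log record to the transaction it belongs to.
        tx_id = record["tx_id"]
        self.open_txns.setdefault(tx_id, []).append(record)
        # The commit record fixes the transaction's place in the serial list.
        if record["op"] == "commit":
            self.serial_list.append(tx_id)


fan = Fanout()
log = [
    {"tx_id": 7, "op": "insert"},
    {"tx_id": 9, "op": "update"},
    {"tx_id": 9, "op": "commit"},
    {"tx_id": 7, "op": "commit"},
]
for rec in log:
    fan.receive(rec)

commit_order = fan.serial_list
```

Transaction 7 started first, but transaction 9 committed first, so 9 comes first in the serial list and would reach the send queue first.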

The second part of the grouper is the grouper evaluator. The
grouper evaluator consists of several threads whose names begin
with "CDRGeval__". The purpose of the grouper evaluator is to:

  • Evaluate the log record to determine if it is a candidate
    for propagation
  • Reconstitute the transaction from the logical logs
  • Compress the transaction by the removal of any duplicate
    operations on the same row
  • Determine the original 'before image' and the final 'after
    image' of any update operation
  • Queue the replicated transaction for transmission to the
    various targets
  • Record any deleted rows in the shadow delete table 

The grouper evaluator is a critical component of ER, so it must be as
streamlined as possible. Otherwise, it would not be able to process
the log records quickly, which would cause a back flow into the log
snooping process. And a back flow into the log snooping process would
cause a significant impact on overall latency. So it is important that
the grouper be able to process the log records quickly and avoid
having to do disk IO.

Phases of the Grouper Evaluator


The first phase of the grouper evaluator is the evaluation phase.
During this phase one of the grouper evaluator threads examines the
log record to determine if it is a candidate for replication. If it is
not, then the row is immediately released. Generally the grouper
evaluates rows as the transaction log buffers are being flushed to
disk, so there is generally no physical IO involved in obtaining the
rows. It also means that if the transaction performs operations on
multiple rows, the grouper may have evaluated the log records before
the commit for the transaction has occurred. This would generally be
the case if the commit of the transaction is in a different log buffer
than the other operations of the transaction. However, the grouper
does not place the transaction into the send queue until it has
processed the commit record and all rows of the transaction have been
evaluated.

The grouper evaluation is performed in parallel. By that I mean that
one log record of the original transaction might be evaluated by one
of the grouper threads while another log record of the same
transaction is evaluated by another thread. This makes it possible for
the evaluator to keep up with the current log position.
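As a rough picture of the parallel evaluation, here is a minimal Python sketch using a thread pool. The names `REPLICATED_TABLES`, `is_candidate` and `evaluate_transaction` are invented for the example, and the "is this table replicated" test is a simplification of what the evaluator really checks.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical: the set of tables that take part in some replicate.
REPLICATED_TABLES = {"payroll"}


def is_candidate(record):
    """Return True when the log record touches a replicated table."""
    return record["table"] in REPLICATED_TABLES


def evaluate_transaction(records, workers=3):
    # Records of the same transaction may be evaluated by different threads.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        verdicts = list(pool.map(is_candidate, records))
    # Leaving the "with" block waits for every evaluation to finish, so the
    # caller only sees the transaction once all of its rows were evaluated.
    # Non-candidates are released; the survivors keep their log order.
    return [rec for rec, keep in zip(records, verdicts) if keep]


records = [
    {"table": "payroll", "op": "update", "row": 1},
    {"table": "scratch", "op": "insert", "row": 5},
    {"table": "payroll", "op": "delete", "row": 2},
]
kept = evaluate_transaction(records)
```

Only the two `payroll` records survive; the `scratch` record is dropped as soon as it is found not to be a candidate.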


Once the commit record has been processed, the grouper goes through a
compression phase. This involves determining all of the operations for
a given row within the transaction and eliminating any unnecessary
operations. For instance, if a row was updated multiple times within
the transaction, the duplicate operations will be eliminated and only
the original before image and the final after image will be saved. If
a row was inserted in a transaction and then deleted in that same
transaction, then it will not even be replicated. This process reduces
the overall size of the transaction which will be placed into the send
queue.
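The compression rule can be sketched in a few lines of Python. This is an illustration under assumptions of my own: each operation is a dict with a row `key`, a `before` image (None for an insert) and an `after` image (None for a delete). The real grouper works on log records, not on dicts.

```python
def compress(ops):
    """Collapse a transaction's operations to one net operation per row."""
    first_before = {}  # key -> before image of the first operation on that row
    last_after = {}    # key -> after image of the last operation on that row
    for op in ops:     # ops are assumed to be in log order
        key = op["key"]
        if key not in first_before:
            first_before[key] = op["before"]
        last_after[key] = op["after"]

    result = []
    for key, before in first_before.items():
        after = last_after[key]
        if before is None and after is None:
            continue   # inserted and deleted in the same transaction
        if before is None:
            kind = "insert"
        elif after is None:
            kind = "delete"
        else:
            kind = "update"
        result.append({"key": key, "op": kind, "before": before, "after": after})
    return result


ops = [
    {"key": 1, "before": {"pay": 10}, "after": {"pay": 20}},  # update
    {"key": 2, "before": None, "after": {"pay": 5}},          # insert
    {"key": 1, "before": {"pay": 20}, "after": {"pay": 30}},  # update again
    {"key": 2, "before": {"pay": 5}, "after": None},          # delete
]
compressed = compress(ops)
```

Row 1 ends up as a single update from pay 10 to pay 30, and row 2 disappears entirely because it was inserted and deleted within the same transaction.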

Additionally, the compression phase is a requirement for transmitting
the correct operation. There are many cases where the operation
cannot be transmitted using the same operation as was performed on the
source. For instance, suppose a replicate is defined with a filter -
say "select * from payroll where status_column = 4". Now suppose the
following command was issued:

update payroll set status_column = 4 where emp_no = 23412;

Unless the before image of the row had a status_column of 4, the
target would not have the existing row, because the before image was
not a member of the replicated set of data. Therefore, when the update
operation is replicated, we need to replicate it as an insert, not as
an update.

Likewise, suppose the following statement was issued:

update payroll set status_column = 3 where emp_no = 23412;

If the before image of the row had the status_column set to 4, then the
update operation would be removing the row from the set of replicated
data because status_column no longer has the value 4 required by the
filter used to define the replicate. That means that the update
operation would need to be replicated as a delete operation.
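Both cases follow from one rule: compare whether the before image and the after image satisfy the replicate's filter. A minimal Python sketch, with `in_filter` and `replicated_op` as made-up names and the filter hard-coded to the status_column = 4 example:

```python
def in_filter(row):
    """Mirror the example replicate filter: status_column = 4."""
    return row["status_column"] == 4


def replicated_op(before, after):
    """Decide how an update on the source must be sent to the targets."""
    was_in = in_filter(before)
    now_in = in_filter(after)
    if not was_in and now_in:
        return "insert"  # row enters the replicated set
    if was_in and not now_in:
        return "delete"  # row leaves the replicated set
    if was_in and now_in:
        return "update"  # row stays in the replicated set
    return None          # row was never in the replicated set


first = replicated_op({"emp_no": 23412, "status_column": 2},
                      {"emp_no": 23412, "status_column": 4})
second = replicated_op({"emp_no": 23412, "status_column": 4},
                       {"emp_no": 23412, "status_column": 3})
```

The first update is sent as an insert and the second as a delete, matching the two payroll statements above. An update that leaves the row outside the filter both before and after is not sent at all.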


The final phase of the grouper evaluator is the copy phase.
 During this time, the replicated transaction is placed into
the send queue for transmission to the target nodes.  Although
the transaction may be transmitted to multiple targets, it is placed
into the send queue only once.   The transaction is placed
into the queue in a 'stream' format - which basically means that it is
put into a network-independent format.  That means that
objects such as user defined types are converted into a stream for
transmission to the target nodes.
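To make the "queued once, sent to many" point concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the in-memory `send_queue` list, the `to_stream` and `enqueue` helpers, the target names, and above all the use of JSON, which merely stands in for the real stream format.

```python
import json

send_queue = []  # hypothetical in-memory stand-in for the ER send queue


def to_stream(txn):
    """Serialize a transaction into a byte stream (JSON as a stand-in)."""
    return json.dumps(txn, sort_keys=True).encode("utf-8")


def enqueue(txn, targets):
    # One queue entry per transaction, however many targets it goes to.
    send_queue.append({"stream": to_stream(txn), "targets": sorted(targets)})


txn = {"tx_id": 9,
       "rows": [{"op": "insert", "emp_no": 23412, "status_column": 4}]}
enqueue(txn, {"serv_b", "serv_c"})
```

The queue holds a single entry carrying the serialized transaction and the list of targets, rather than one copy per target.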

Configuring the Grouper Threads

The onconfig parameter used to configure the grouper evaluator threads
is CDR_EVALTHREADS  x,y where 'x' is the number of threads per
CPUVP and 'y' is a number of extra threads.  The default is
I personally think that 1,2 is a good setting for the number of
evaluator threads. The theory is that we want to evaluate the log
records as quickly as possible. Since the majority of the work that the
grouper evaluator threads do is very lightweight, simple evaluation of
the log records, there is little cost in having one per CPUVP.
Also this makes it easier to maintain a balance between the
logging work and the consumer of the logs.
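For a quick feel of what a setting means, the thread count works out to x times the number of CPUVPs, plus y. The Python helpers below are just arithmetic on the parameter string; `parse_evalthreads` and `total_eval_threads` are names I made up, and the count of 4 CPUVPs is an example value.

```python
def parse_evalthreads(value):
    """Split a CDR_EVALTHREADS value such as '1,2' into (per_cpu_vp, extra)."""
    per_cpu_vp, extra = (int(part) for part in value.split(","))
    return per_cpu_vp, extra


def total_eval_threads(value, num_cpu_vps):
    """Threads per CPUVP times the number of CPUVPs, plus the extra threads."""
    per_cpu_vp, extra = parse_evalthreads(value)
    return per_cpu_vp * num_cpu_vps + extra


threads = total_eval_threads("1,2", num_cpu_vps=4)
```

With CDR_EVALTHREADS 1,2 on a server with 4 CPUVPs, that gives 1 * 4 + 2 = 6 evaluator threads.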

However, there is still the problem of having to maintain the local
shadow delete table. Maintaining the delete table does involve some
blocking activity because we have to perform IO to the delete table
itself. That will cause the grouper evaluator thread to go into a wait
state, which can lead to lagging behind in the consumption of the
logs. That's why it still makes sense to have a couple of extra
evaluator threads.